What Is a CSV File?
A CSV file (Comma Separated Values file) is a type of plain text file
that uses specific structuring to arrange tabular data. Because it’s a plain
text file, it can contain only actual text data—in other words, printable ASCII or Unicode characters.
The structure of a CSV file is given away by its name. Normally, CSV
files use a comma to separate each specific data value.
If you open the above CSV file using a text editor such as
sublime text, you will see:
ItemName,Quantity,Price
LED,5,30000
LCD,6,24000
Woofer,10,3000
Laptop,3,30000
desktop,3,25000
headphone,2,1400
CRT,5,30000
TFt,6,24000
Sound,10,300
CD Writer,3,5000
DVD Writer,3,450
Smart Mobile,2,1400
DataTransfer between dataFrames and .csv file
CSV format is a kind of tabular data
separated by comma and is stored in the form of plan text.
In CSV Format/propeties :
- One line for each record
- Comma
separated fields
- Space-characters
adjacent to commas are ignored
- Fields with
in-built commas are separated by double quote characters.
- Fields with
double quote characters must be surrounded by double quotes. Each inbuilt
double quote must be represented by a pair of consecutive quotes
- Fields that
contain inbuilt line-breaks must be surrounded by double quotes
When Use CSV?
- When data has
a strict tabular structure
- To transfer
large database between programs
- To import and
export data to office applications, Qedoc modules
- To store,
manage and modify shopping cart catalogue
CSV Advantages
- CSV is human
readable and easy to edit manually
- CSV is simple
to implement and parse
- CSV is
processed by almost all existing applications
- CSV provides
a straightforward information schema
- CSV is faster
to handle
- CSV is
smaller in size
- CSV is
considered to be standard format
- CSV is
compact. For XML you start tag and end tag for each column in each row. In
CSV you write the column headers only once.
- CSV is easy
to generate
CSV Disadvantages
- CSV allows to
move most basic data only. Complex configurations cannot be imported and
exported this way
- There is no
distinction between text and numeric values
- No standard
way to represent binary data
- Problems with
importing CSV into SQL (no distinction between NULL and quotes)
- Poor support
of special characters
- No standard
way to represent control characters
- Lack of
universal standard
Write a CSV File
You
can save your Pandas DataFrame as
a CSV file with .to_csv():
Example:
import
pandas as pd
SData={"name":['Jyotika','Muskan','Amit','Anchit','Aviral','Mahi'],\
'Accounts':[54,76,98,54,76,87],'English':[89,87,54,89,43,67],\
'Bst':[65,67,87,56,87,54]}
print("Convert
dictionary to dataframe")
df=pd.DataFrame(SData)
print(df)
df.to_csv('data.csv')
That’s it! You’ve created the file data.csv in your current working directory.
Read a CSV File
Once your data is saved in a CSV file, you’ll likely want to load and
use it from time to time. You can do that with the Pandas read_csv() function:
Example:
import pandas as pd
df =
pd.read_csv('data.csv', index_col=0)
print(df)
Output:
Example : To display the shape (number
of rows/record and columns/fields) of the CSV file.
>>>
row,column=df.shape
>>> row
6
>>> column
4
>>>
Reading CSV file with specific/selected columns :
While working with large tables in CSV format,
there can be several columns contained in it. But you may require selective
columns to be read into a dataframe. This can be done by using “usecols”
attribute or option along with read_csv() method.
Example :
import
pandas as pd
df =
pd.read_csv('data.csv', usecols=['name','English'])
print(df)
Output:
Reading CSV file with specific/selected rows :
Like columns , there can be thousands of records in
a CSV file. You can display selective records/rows using “nrows” option
or attribute use with read_csv() method.
Example :
import
pandas as pd
df =
pd.read_csv('data.csv', nrows=3, index_col=0)
print(df)
Output:
Reading CSV
file without header :
If you can do not want to display the first
row as the header for dataframe using Employee table, then this can be done by
specifying None argument for ‘header’ option or ‘skiprows’ option using read_csv() method.
import pandas
as pd
df = pd.read_csv('data.csv',
nrows=3, header=None)
print(df)
Output:
Reading CSV file with new column names
You can read
and load data from CSV file into a dataframe with new column names(rename)
while reading the .csv file. If the header exists, you have to skip it using
skiprows option along with names options for renaming the columns.
Example :
import
pandas as pd
df =
pd.read_csv('data.csv',index_col=0)
print(df)
print('\nReading
CSV file with rename column \n')
df =
pd.read_csv('data.csv',skiprows=1,names=['Sname','Acct','Eng','BusStudies',])
print(df)
Output:
No comments:
Post a Comment