Python Tutorial: Data File Handling

Sunday, 25 July 2021

Data File Handling

What is a file?

A file is a named location on a secondary storage medium where data is stored permanently for later access.

Types of files

Computers store every file as a collection of 0s and 1s, i.e., in binary form. Therefore, every file is essentially a series of bytes stored one after the other. There are mainly two types of data files:

·         text file  

·         binary file

 

Text file

·         A text file contains only textual information consisting of letters, digits and other special symbols. Such files are stored with extensions like .txt, .py, .c, .csv, .html, etc. In an ASCII-encoded text file, each byte represents one character; in other encodings, such as UTF-8, a character may take more than one byte.

·         Each line of a text file is stored as a sequence of the ASCII (or other encoding) values of its characters and is terminated by a special character, called the End of Line (EOL).

·         Contents in a text file are usually separated by whitespace, but comma (,) and tab (\t) are also commonly used to separate values in a text file.

 

Binary Files

 

·         Binary files are also stored in terms of bytes (0s and 1s), but unlike text files, these bytes do not represent the ASCII values of characters. Rather, they represent the actual content such as image, audio, video, compressed versions of other files, executable files, etc. These files are not human readable. Thus, trying to open a binary file using a text editor will show some garbage values. We need specific software to read or write the contents of a binary file.

·         Binary files are stored in a computer in a sequence of bytes. Even a single bit change can corrupt the file and make it unreadable to the supporting application. Also, it is difficult to remove any error which may occur in the binary file as the stored contents are not human readable. We can read and write both text and binary files through Python programs.
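As a small illustration of the difference, the sketch below writes a short piece of text and then reads the same file back once in text mode ("r") and once in binary mode ("rb"). The file name sample.txt and the temporary folder are assumptions made only for this example.

```python
import os
import tempfile

# Hypothetical file in a temporary folder, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

with open(path, "w", encoding="ascii") as f:
    f.write("Hi!")

# Text mode returns a str made of characters.
with open(path, "r", encoding="ascii") as f:
    as_text = f.read()

# Binary mode returns the raw bytes stored on disk.
with open(path, "rb") as f:
    as_bytes = f.read()

print(repr(as_text))   # 'Hi!'
print(list(as_bytes))  # [72, 105, 33], the ASCII codes of H, i and !
```

The bytes on disk are the same in both cases; only the way Python presents them differs.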


Opening and Closing a Text File

·         In real world applications, computer programs deal with data coming from different sources like databases, CSV files, HTML, XML, JSON, etc. We broadly access files either to write data to them or to read data from them. Operations on files include creating and opening a file, writing data to a file, traversing a file, reading data from a file and so on. Python provides the built-in open() function and the io module for handling files.

Opening a file

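    • The open() function has the following syntax:
    • Open a text file: Syntax: <file object> = open(file_name, access_mode)
      • file object: the variable that refers to the opened file and is used for all later operations on it.
      • open(): a built-in function; file_name and access_mode are its two most commonly used parameters.
      • file_name: the name (or path) of the file to open, for example a file with the .txt extension.
      • access_mode: an optional string that specifies the mode in which the file is accessed. The default mode is reading mode.
        • These modes are
          • r: to read a file (the file must already exist)
          • w: to write (an existing file is emptied; a new file is created if none exists)
          • a: to append contents at the end of the file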
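The three modes can be tried out with a short sketch. The file name notes.txt and the temporary folder are assumptions for this example only.

```python
import os
import tempfile

# Hypothetical file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

f = open(path, "w")   # "w" creates the file (or empties an existing one)
f.write("first line\n")
f.close()

f = open(path, "a")   # "a" keeps the old contents and adds at the end
f.write("second line\n")
f.close()

f = open(path)        # no mode given, so the default "r" is used
mode_used = f.mode
contents = f.read()
f.close()

print(mode_used)
print(contents)
```

Opening the same file again in "w" mode would discard both lines, which is why "a" is the safer choice when existing data must be kept.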

Closing a file

The close() method is used to close the file. Closing a file writes any unwritten (buffered) data to it and frees the system resources, such as memory, that were allocated to it. The syntax of close() is:

file_object.close()

Here, file_object is the object that was returned while opening the file.
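Whether a file object is still open can be checked through its closed attribute. A minimal sketch, assuming a hypothetical file demo.txt in a temporary folder:

```python
import os
import tempfile

# Hypothetical file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

file_object = open(path, "w")
open_before = file_object.closed   # False: the file is still open
file_object.write("some text\n")
file_object.close()
closed_after = file_object.closed  # True: close() has been called

# A closed file object can no longer be used for reading or writing.
write_failed = False
try:
    file_object.write("more text\n")
except ValueError:
    write_failed = True

print(open_before, closed_after, write_failed)
```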

Opening a file using with clause

In Python, we can also open a file using the with clause. The syntax of the with clause is:

with open(file_name, access_mode) as file_object:

The advantage of using with clause is that any file that is opened using this clause is closed automatically, once the control comes outside the with clause. In case the user forgets to close the file explicitly or if an exception occurs, the file is closed automatically. Also, it provides a simpler syntax.

with open("myfile.txt", "r+") as myObject:
    content = myObject.read()

Here, we don’t have to close the file explicitly using the close() method. Python will close the file automatically.
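The automatic closing can be verified with the closed attribute of the file object. The sketch below assumes a hypothetical myfile.txt in a temporary folder and simulates an error inside the second with clause.

```python
import os
import tempfile

# Hypothetical file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "myfile.txt")

with open(path, "w") as myObject:
    myObject.write("Hello\n")

closed_after_block = myObject.closed   # closed once the block has ended

try:
    with open(path, "r") as myObject:
        content = myObject.read()
        raise RuntimeError("simulated error")  # raised while the file is open
except RuntimeError:
    pass

closed_after_error = myObject.closed   # closed even though an exception occurred

print(closed_after_block, closed_after_error)
```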

Writing to a Text File

For writing to a file, we first need to open it in write or append mode. If we open an existing file in write mode, the previous data will be erased, and the file object will be positioned at the beginning of the file. In append mode, on the other hand, new data will be added after the previous data, as the file object is positioned at the end of the file. After opening the file, we can use the following methods to write to it:

·         write(): for writing a single string

·         writelines(): for writing a sequence of strings

 

The write() method

The write() method takes a string as an argument and writes it to the text file. It returns the number of characters written in a single call. write() does not add a line break on its own, so we need to add a newline character (\n) at the end of every line to mark the end of line.

 

Consider the following piece of code:
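A minimal sketch of such code is given here. The file name myfile.txt, the temporary folder and the sentence being written are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "myfile.txt")

myobject = open(path, "w")
# write() returns the number of characters written; "\n" counts as one.
count = myobject.write("Hello Python\n")
myobject.close()

print(count)  # 13
```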

To store a numeric value in a text file, it must first be converted to a string, for example with str().
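The need for this conversion can be seen in a short sketch. The variable marks, its value and the file name marks.txt are hypothetical.

```python
import os
import tempfile

# Hypothetical file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "marks.txt")
marks = 58  # hypothetical numeric value

rejected = False
with open(path, "w") as f:
    try:
        f.write(marks)           # an int is not accepted by write()
    except TypeError:
        rejected = True
    f.write(str(marks) + "\n")   # convert to a string first

with open(path) as f:
    stored = f.read()

# Data read back from a text file is a string; int() converts it again.
number_again = int(stored)

print(rejected, repr(stored), number_again)
```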

 

The writelines() method

This method is used to write multiple strings to a file. We need to pass an iterable object, such as a list or a tuple, containing strings to the writelines() method. Unlike write(), the writelines() method does not return the number of characters written to the file. It also does not add newline characters, so each string must include its own \n where a line break is needed. The following code explains the use of writelines().

 

Program to add list items to a file using writelines() method.
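One possible version of such a program is sketched below. The list contents, the file name myfile.txt and the temporary folder are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "myfile.txt")

# Each string carries its own "\n" because writelines() adds none.
lines = [
    "Hello everyone\n",
    "Writing multiline strings\n",
    "This is the third line\n",
]

myobject = open(path, "w")
result = myobject.writelines(lines)   # writelines() returns None
myobject.close()

with open(path) as f:
    written = f.read()

print(result)
print(written)
```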

 

 

Reading from a Text File

We can write a program to read the contents of a file. Before reading a file, we must make sure that the file is opened in “r”, “r+”, “w+” or “a+” mode. There are three ways to read the contents of a file:

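The read() method

This method is used to read a specified number (n) of bytes of data from a data file; for a file opened in text mode, n is counted in characters. If n is omitted or negative, the whole file is read. The syntax of the read() method is: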

file_object.read(n)

Consider the following set of statements to understand the usage of read() method:
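A minimal sketch of such statements follows. The file is created first so that the example is self-contained; its name and contents are assumptions.

```python
import os
import tempfile

# Hypothetical file in a temporary folder, created for this example.
path = os.path.join(tempfile.mkdtemp(), "myfile.txt")
with open(path, "w") as f:
    f.write("Hello everyone\nWelcome to file handling\n")

myobject = open(path, "r")
first_part = myobject.read(5)   # the first five characters
rest = myobject.read()          # everything that remains in the file
at_end = myobject.read()        # empty string, nothing is left to read
myobject.close()

print(repr(first_part))
print(repr(rest))
print(repr(at_end))
```

Each call continues from where the previous one stopped, so the second read() does not return "Hello" again.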

 

 

 

The readline([n]) method

This method reads one complete line from a file where each line terminates with a newline (\n) character. It can also be used to read a specified number (n) of bytes of data from a file, but only up to and including the newline character (\n). For example, readline(10) on a newly opened file reads at most the first ten characters of the first line.

Syntax:

file_object.readline(n)

 

 

If no argument or a negative number is specified, it reads a complete line and returns it as a string.
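Both forms of readline() can be seen in the sketch below. The file name and its two lines are assumptions made for this example.

```python
import os
import tempfile

# Hypothetical file in a temporary folder, created for this example.
path = os.path.join(tempfile.mkdtemp(), "myfile.txt")
with open(path, "w") as f:
    f.write("Hello everyone\nWelcome\n")

myobject = open(path, "r")
part = myobject.readline(10)         # at most ten characters of line 1
remainder = myobject.readline()      # the rest of line 1, including "\n"
short_line = myobject.readline(100)  # stops at "\n" although n is larger
end = myobject.readline()            # empty string at the end of the file
myobject.close()

print(repr(part))
print(repr(remainder))
print(repr(short_line))
print(repr(end))
```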

 

The readlines() method

This method reads all the lines and returns them as a list of strings, keeping the newline character (\n) that terminates each line. The following example uses readlines() to read data from the text file test_file.txt.
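A sketch of such an example is given here. The contents of test_file.txt are assumed, and the file is created in a temporary folder so that the code runs on its own.

```python
import os
import tempfile

# test_file.txt is created here with assumed contents.
path = os.path.join(tempfile.mkdtemp(), "test_file.txt")
with open(path, "w") as f:
    f.write("Hello everyone\nWriting multiline strings\nThis is the third line\n")

myobject = open(path, "r")
lines = myobject.readlines()   # one list item per line, "\n" included
myobject.close()

# The trailing newline can be removed from each line when it is not wanted.
stripped = [line.rstrip("\n") for line in lines]

print(lines)
print(stripped)
```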

 
