3.1415926535
8979323846
2643383279
First we store the name of the file we’re reading from in the variable
filename. This is a common convention when working with files. Because
the variable filename doesn’t represent the actual file—it’s just a string telling Python where to find the file—you can easily swap out 'pi_digits.txt'
for the name of another file you want to work with. After we call open(),
an object representing the file and its contents is stored in the variable
file_object. We again use the with syntax to let Python open and close
the file properly. To examine the file’s contents, we work through each line
in the file by looping over the file object.
Making a List of Lines from a File
When you use with, the file object returned by open() is only available inside
the with block that contains it. If you want to retain access to a file’s contents
outside the with block, you can store the file’s lines in a list inside the block
and then work with that list. You can process parts of the file immediately
and postpone some processing for later in the program.
The following example stores the lines of pi_digits.txt in a list inside the
with block and then prints the lines outside the with block:
filename = 'pi_digits.txt'
with open(filename) as file_object:
    lines = file_object.readlines()
for line in lines:
    print(line.rstrip())
The readlines() method takes each line from the file and stores it
in a list. This list is then stored in lines, which we can continue to work with
after the with block ends. Then we use a simple for loop to print each line
from lines. Because each item in lines corresponds to each line in the file,
the output matches the contents of the file exactly.
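To see why the loop calls rstrip(), it can help to look at the raw list that readlines() returns. The following is a minimal sketch, assuming a throwaway temporary file stands in for pi_digits.txt; the tempfile setup and the name stripped_lines are illustrative and not part of the original program.

```python
import os
import tempfile

# Create a temporary stand-in for pi_digits.txt (an assumption for this sketch).
with tempfile.TemporaryDirectory() as tmp_dir:
    filename = os.path.join(tmp_dir, 'pi_digits.txt')
    with open(filename, 'w') as file_object:
        file_object.write("3.1415926535\n8979323846\n2643383279\n")

    with open(filename) as file_object:
        lines = file_object.readlines()

# The list is still available after the with block has closed the file.
# Each item keeps its trailing newline character.
print(lines)

# rstrip() removes trailing whitespace, including the newline.
stripped_lines = [line.rstrip() for line in lines]
print(stripped_lines)
```

Without rstrip(), print() would add its own newline after the one already stored in each item, and a blank line would appear between the lines of output.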
Working with a File’s Contents
After you’ve read a file into memory, you can do whatever you want with
that data, so let’s briefly explore the digits of pi. First, we’ll attempt to build
a single string containing all the digits in the file with no whitespace in it:
filename = 'pi_30_digits.txt'
with open(filename) as file_object:
    lines = file_object.readlines()
pi_string = ''
for line in lines:
    pi_string += line.strip()
print(pi_string)
print(len(pi_string))
Output
3.141592653589793238462643383279
32
We start by opening the file and storing each line of digits in a list, just
as we did in the previous example. Then we create a variable, pi_string, to
hold the digits of pi. We then create a loop that strips the newline character
from each line and adds the remaining digits to pi_string. Finally, we
print this string and also show how long the string is.
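The same string can also be built without an explicit loop. The sketch below is one possible alternative, not the approach used in the listing above: it assumes the lines have already been read into a list (hard-coded here so the example runs on its own) and uses the string method join().

```python
# Lines as readlines() would return them, newline characters included.
lines = ['3.1415926535\n', '8979323846\n', '2643383279\n']

# Strip each line, then join the pieces with an empty separator.
pi_string = ''.join(line.strip() for line in lines)

print(pi_string)
print(len(pi_string))
```

The length is 32 because the string holds 30 decimal places plus the leading 3 and the decimal point.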
Writing to a File
One of the simplest ways to save data is to write it to a file. When you write
text to a file, the output will still be available after you close the terminal
containing your program’s output. You can examine output after a program
finishes running, and you can share the output files with others as well. You
can also write programs that read the text back into memory and work with
it again later.
Writing to an Empty File
To write text to a file, you need to call open() with a second argument telling
Python that you want to write to the file. To see how this works, let’s write a
simple message and store it in a file instead of printing it to the screen:
filename = 'programming.txt'
with open(filename, 'w') as file_object:
    file_object.write("I love programming.")
The call to open() in this example has two arguments. The first argument is still the name of the file we want to open. The second argument, 'w',
tells Python that we want to open the file in write mode. You can open a file in read mode ('r'), write mode ('w'), append mode ('a'), or a mode that allows
you to read and write to the file ('r+'). If you omit the mode argument,
Python opens the file in read-only mode by default.
The open() function automatically creates the file you’re writing to if it
doesn’t already exist. However, be careful opening a file in write mode ('w')
because if the file does exist, Python will erase the file before returning the
file object.
We use the write() method on the file object to write a string to
the file. This program has no terminal output, but if you open the file
programming.txt, you’ll see one line:
I love programming.
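The warning about write mode can be checked with a small experiment. This is a minimal sketch that assumes a temporary directory so that no real file is overwritten; the strings written are placeholders.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    filename = os.path.join(tmp_dir, 'programming.txt')

    # First write: the file does not exist yet, so open() creates it.
    with open(filename, 'w') as file_object:
        file_object.write("Old contents.")

    # Second write: opening in 'w' mode again erases the old contents.
    with open(filename, 'w') as file_object:
        file_object.write("New contents.")

    with open(filename) as file_object:
        contents = file_object.read()

print(contents)
```

Only the text from the second write survives, which is why write mode should be used with care on files that already hold data.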
Note:
Python can only write strings to a text file. If you want to store numerical data in a
text file, you’ll have to convert the data to string format first using the str() function.
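As a small illustration of the note above, the following sketch converts numbers with str() before writing them. The list scores and the file name scores.txt are hypothetical and used only for this example.

```python
import os
import tempfile

scores = [90, 85, 77]  # Hypothetical numerical data for this sketch.

with tempfile.TemporaryDirectory() as tmp_dir:
    filename = os.path.join(tmp_dir, 'scores.txt')

    with open(filename, 'w') as file_object:
        for score in scores:
            # write() expects a string, so convert each number first.
            file_object.write(str(score) + "\n")

    with open(filename) as file_object:
        contents = file_object.read()

print(contents)
```

When the file is read back, each value is a string again, so it would need int() or float() before it could be used in arithmetic.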
Writing Multiple Lines
The write() method doesn’t add any newlines to the text you write. So if
you write more than one line without including newline characters, your
file may not look the way you want it to:
filename = 'programming.txt'
with open(filename, 'w') as file_object:
    file_object.write("I love programming.")
    file_object.write("I love creating new games.")
If you open programming.txt, you’ll see the two lines squished together:
I love programming.I love creating new games.
Adding a newline character ("\n") to the end of each string puts each
statement on its own line. You can also use spaces, tab characters, and blank
lines to format your output, just as you’ve been doing with terminal-based output.
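A version of the two write() calls with newline characters might look like the sketch below. It assumes a temporary directory so that it can be run without touching an existing programming.txt.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    filename = os.path.join(tmp_dir, 'programming.txt')

    with open(filename, 'w') as file_object:
        # The "\n" at the end of each string starts a new line in the file.
        file_object.write("I love programming.\n")
        file_object.write("I love creating new games.\n")

    with open(filename) as file_object:
        contents = file_object.read()

print(contents)
```

Each sentence now sits on its own line in the file.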
Appending to a File
If you want to add content to a file instead of writing over existing content,
you can open the file in append mode. When you open a file in append mode,
Python doesn’t erase the file before returning the file object. Any lines you
write to the file will be added at the end of the file. If the file doesn’t exist
yet, Python will create an empty file for you.
Let’s modify our previous program by adding some new reasons we love programming to the existing file programming.txt:
filename = 'programming.txt'
with open(filename, 'a') as file_object:
    file_object.write("I also love finding meaning in large datasets.\n")
    file_object.write("I love creating apps that can run in a browser.\n")
We use the 'a' argument to open the file for appending rather
than writing over the existing file. We wrote two new lines, which are
added to programming.txt:
I love programming.I love creating new games.
I also love finding meaning in large datasets.
I love creating apps that can run in a browser.
We end up with the original contents of the file, followed by the new
content we just added.
Exceptions
Python uses special objects called exceptions to manage errors that arise during a program’s execution. Whenever an error occurs that makes Python
unsure what to do next, it creates an exception object. If you write code
that handles the exception, the program will continue running. If you don’t
handle the exception, the program will halt and show a traceback, which
includes a report of the exception that was raised.
Exceptions are handled with try-except blocks. A try-except block asks
Python to do something, but it also tells Python what to do if an exception is raised. When you use try-except blocks, your programs will continue
running even if things start to go wrong. Instead of tracebacks, which can
be confusing for users to read, users will see friendly error messages that
you write.
Handling the ZeroDivisionError Exception
Let’s look at a simple error that causes Python to raise an exception. You
probably know that it’s impossible to divide a number by zero, but let’s ask
Python to do it anyway:
print(5/0)
Of course Python can’t do this, so we get a traceback:
Traceback (most recent call last):
  File "division.py", line 1, in <module>
    print(5/0)
ZeroDivisionError: division by zero
The error reported in the traceback, ZeroDivisionError, is an exception object. Python creates this kind of object in response to a situation
where it can’t do what we ask it to. When this happens, Python stops the
program and tells us the kind of exception that was raised. We can use this
information to modify our program. We’ll tell Python what to do when this
kind of exception occurs; that way, if it happens again, we’re prepared.
Using try-except Blocks
When you think an error may occur, you can write a try-except block to
handle the exception that might be raised. You tell Python to try running
some code, and you tell it what to do if the code results in a particular kind
of exception.
Here’s what a try-except block for handling the ZeroDivisionError exception looks like:
try:
    print(5/0)
except ZeroDivisionError:
    print("You can't divide by zero!")
Output
You can't divide by zero!
We put print(5/0), the line that caused the error, inside a try block. If
the code in a try block works, Python skips over the except block. If the code
in the try block causes an error, Python looks for an except block whose
error matches the one that was raised and runs the code in that block.
In this example, the code in the try block produces a ZeroDivisionError,
so Python looks for an except block telling it how to respond. Python then
runs the code in that block, and the user sees a friendly error message
instead of a traceback.
Using Exceptions to Prevent Crashes
Handling errors correctly is especially important when the program has
more work to do after the error occurs. This happens often in programs
that prompt users for input. If the program responds to invalid input appropriately, it can prompt for more valid input instead of crashing.
Let’s create a simple calculator that does only division:
print("Give me two numbers, and I'll divide them.")
print("Enter 'q' to quit.")
while True:
    first_number = input("\nFirst number: ")
    if first_number == 'q':
        break
    second_number = input("Second number: ")
    if second_number == 'q':
        break
    answer = int(first_number) / int(second_number)
    print(answer)
This program prompts the user to input a first_number and, if the
user does not enter q to quit, a second_number. We then divide these two
numbers to get an answer. This program does nothing to handle errors,
so asking it to divide by zero causes it to crash:
Output
Give me two numbers, and I'll divide them.
Enter 'q' to quit.
First number: 5
Second number: 0
Traceback (most recent call last):
  File "division.py", line 9, in <module>
    answer = int(first_number) / int(second_number)
ZeroDivisionError: division by zero
It’s bad that the program crashed, but it’s also not a good idea to let
users see tracebacks. Nontechnical users will be confused by them, and in
a malicious setting, attackers will learn more than you want them to know
from a traceback. For example, they’ll know the name of your program
file, and they’ll see a part of your code that isn’t working properly. A skilled
attacker can sometimes use this information to determine which kind of
attacks to use against your code.
The else Block
We can make this program more error resistant by wrapping the line that
might produce errors in a try-except block. The error occurs on the line
that performs the division, so that’s where we’ll put the try-except block.
This example also includes an else block. Any code that depends on the try
block executing successfully goes in the else block:
print("Give me two numbers, and I'll divide them.")
print("Enter 'q' to quit.")
while True:
    first_number = input("\nFirst number: ")
    if first_number == 'q':
        break
    second_number = input("Second number: ")
    if second_number == 'q':
        break
    try:
        answer = int(first_number) / int(second_number)
    except ZeroDivisionError:
        print("You can't divide by 0!")
    else:
        print(answer)
Output
Give me two numbers, and I'll divide them.
Enter 'q' to quit.
First number: 5
Second number: 0
You can't divide by 0!
First number: 5
Second number: 2
2.5
First number: q
We ask Python to try to complete the division operation in a try
block, which includes only the code that might cause an error. Any
code that depends on the try block succeeding is added to the else block.
In this case if the division operation is successful, we use the else block to
print the result.
The except block tells Python how to respond when a ZeroDivisionError
arises. If the try statement doesn’t succeed because of a division by
zero error, we print a friendly message telling the user how to avoid this
kind of error. The program continues to run, and the user never sees a
traceback.
The try-except-else block works like this: Python attempts to run the
code in the try statement. The only code that should go in a try statement
is code that might cause an exception to be raised. Sometimes you’ll have
additional code that should run only if the try block was successful; this
code goes in the else block. The except block tells Python what to do in case
a certain exception arises when it tries to run the code in the try statement.
By anticipating likely sources of errors, you can write robust programs
that continue to run even when they encounter invalid data and missing
resources. Your code will be resistant to innocent user mistakes and malicious attacks.
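Division by zero is not the only invalid input the calculator can receive. If the user types text such as five, int() raises a ValueError. The sketch below goes one step beyond the program above and handles both cases. The helper divide_text() and its messages are hypothetical, and it takes strings as arguments instead of calling input() so that it can be run without a keyboard.

```python
def divide_text(first_number, second_number):
    """Divide two numbers given as strings; return None on invalid input.

    divide_text() is a hypothetical helper, not part of the original program.
    """
    try:
        answer = int(first_number) / int(second_number)
    except ZeroDivisionError:
        print("You can't divide by 0!")
        return None
    except ValueError:
        # int() raises ValueError for text such as 'five'.
        print("Please enter whole numbers only.")
        return None
    else:
        return answer


print(divide_text('5', '2'))
print(divide_text('5', '0'))
print(divide_text('five', '2'))
```

A try block can be followed by several except blocks; Python runs the first one whose exception type matches the error that was raised.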
Handling the FileNotFoundError Exception
One common issue when working with files is handling missing files. The
file you’re looking for might be in a different location, the filename may
be misspelled, or the file may not exist at all. You can handle all of these
situations in a straightforward way with a try-except block. In the example below, we try to open a file that doesn’t exist:
filename = 'alice.txt'
try:
    with open(filename) as f_obj:
        contents = f_obj.read()
except FileNotFoundError:
    msg = "Sorry, the file " + filename + " does not exist."
    print(msg)
Output
Sorry, the file alice.txt does not exist.
In this example, the code in the try block produces a FileNotFoundError,
so Python looks for an except block that matches that error. Python then
runs the code in that block, and the result is a friendly error message
instead of a traceback.
Analyzing Text
Let’s read in a text file and count the number of words in it. We’ll use
the string method split(), which builds a list of words from a string.
Create a file named alice.txt, write some text into it, and then count the
number of words in the file:
filename = 'alice.txt'
try:
    with open(filename) as f_obj:
        contents = f_obj.read()
except FileNotFoundError:
    msg = "Sorry, the file " + filename + " does not exist."
    print(msg)
else:
    # Count the approximate number of words in the file.
    words = contents.split()
    num_words = len(words)
    print("The file " + filename + " has about " + str(num_words) + " words.")
Output
The file alice.txt has about 2461 words.
First, we take the string contents and use the split()
method to produce a list of all the words in the file. When we use len() on
this list to examine its length, we get a good approximation of the number
of words in the original string. Then we print a statement that reports
how many words were found in the file. This code is placed in the else block
because it will work only if the code in the try block was executed successfully. The output tells us how many words are in alice.txt.
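A short experiment shows how split() behaves and why the count is only an approximation. The sample strings here are placeholders chosen for illustration.

```python
title = "Alice in Wonderland"
words = title.split()
print(words)       # ['Alice', 'in', 'Wonderland']
print(len(words))  # 3

# Runs of spaces, tabs, and newlines all count as a single separator.
messy = "  Down the\trabbit-hole\n"
messy_words = messy.split()
print(messy_words)  # ['Down', 'the', 'rabbit-hole']
```

Because split() separates on whitespace only, a hyphenated term such as rabbit-hole counts as one word, and punctuation stays attached to the word next to it.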
Working with Multiple Files
Let’s add more text files to analyze. But before we do, let’s move the bulk of
this program to a function called count_words(). By doing so, it will be easier
to run the analysis for multiple files:
def count_words(filename):
    """Count the approximate number of words in a file."""
    try:
        with open(filename) as f_obj:
            contents = f_obj.read()
    except FileNotFoundError:
        msg = "Sorry, the file " + filename + " does not exist."
        print(msg)
    else:
        # Count the approximate number of words in the file.
        words = contents.split()
        num_words = len(words)
        print("The file " + filename + " has about " + str(num_words) + " words.")
filenames = ['alice.txt', 'siddhartha.txt', 'moby_dick.txt', 'little_women.txt']
for filename in filenames:
    count_words(filename)
Now we can write a simple loop to count the words in any text we want
to analyze. We do this by storing the names of the files we want to analyze
in a list, and then we call count_words() for each file in the list. I’ve
intentionally left siddhartha.txt out of the directory containing
word_count.py, so we can see how well our program handles a missing file.
Using the try-except block in this example provides two significant
advantages. We prevent our users from seeing a traceback, and we let the
program continue analyzing the texts it’s able to find. If we didn’t catch
the FileNotFoundError that siddhartha.txt raised, the user would see a full
traceback, and the program would stop running after trying to analyze
Siddhartha. It would never analyze Moby Dick or Little Women.
Failing Silently
In the previous example, we informed our users that one of the files
was unavailable. But you don’t need to report every exception you catch.
Sometimes you’ll want the program to fail silently when an exception occurs
and continue on as if nothing happened. To make a program fail silently, you
write a try block as usual, but you explicitly tell Python to do nothing in the
except block. Python has a pass statement that tells it to do nothing in a block:
def count_words(filename):
    """Count the approximate number of words in a file."""
    try:
        --snip--
    except FileNotFoundError:
        pass
    else:
        --snip--
The only difference between this listing and the previous one is the
pass statement. Now when a FileNotFoundError is raised, the code in
the except block runs, but nothing happens. No traceback is produced,
and there’s no output in response to the error that was raised.
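For reference, a complete, runnable version of the silent-failure idea might look like the sketch below. It makes two assumptions that are not in the listing above: the function returns the word count so the result can be inspected, and the files live in a temporary directory created just for the example.

```python
import os
import tempfile


def count_words(filename):
    """Count the approximate number of words in a file.

    Returning the count is an addition made for this sketch.
    """
    try:
        with open(filename) as f_obj:
            contents = f_obj.read()
    except FileNotFoundError:
        pass  # Fail silently: no message and no traceback.
    else:
        num_words = len(contents.split())
        print("The file " + filename + " has about " + str(num_words) + " words.")
        return num_words


with tempfile.TemporaryDirectory() as tmp_dir:
    existing = os.path.join(tmp_dir, 'alice.txt')
    missing = os.path.join(tmp_dir, 'siddhartha.txt')
    with open(existing, 'w') as f_obj:
        f_obj.write("Alice was beginning to get very tired")

    found_count = count_words(existing)   # Prints a report.
    missing_count = count_words(missing)  # Prints nothing.
```

The call for the missing file produces no output at all, and because the function never reaches a return statement in that case, it returns None.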
Storing Data
Many of your programs will ask users to input certain kinds of information.
You might allow users to store preferences in a game or provide data for a
visualization. Whatever the focus of your program is, you’ll store the information users provide in data structures such as lists and dictionaries. When
users close a program, you’ll almost always want to save the information
they entered. A simple way to do this involves storing your data using the
json module.
The json module allows you to dump simple Python data structures into a
file and load the data from that file the next time the program runs. You can
also use json to share data between different Python programs. Even better,
the JSON data format is not specific to Python, so you can share data you
store in the JSON format with people who work in many other programming
languages.
Using json.dump() and json.load()
Let’s write a short program that stores a set of numbers and another program that reads these numbers back into memory. The first program will
use json.dump() to store the set of numbers, and the second program will use
json.load().
The json.dump() function takes two arguments: a piece of data to
store and a file object it can use to store the data. Here’s how you can use
json.dump() to store a list of numbers:
import json
numbers = [2, 3, 5, 7, 11, 13]
filename = 'numbers.json'
with open(filename, 'w') as f_obj:
    json.dump(numbers, f_obj)
We first import the json module and then create a list of numbers to
work with. We then choose a filename in which to store the list of numbers.
It’s customary to use the file extension .json to indicate that the data in
the file is stored in the JSON format. Then we open the file in write mode,
which allows json to write the data to the file. Finally, we use the json.dump()
function to store the list numbers in the file numbers.json.
This program has no output, but let’s open the file numbers.json and
look at it. The data is stored in a format that looks just like Python:
[2, 3, 5, 7, 11, 13]
Now we’ll write a program that uses json.load() to read the list back into
memory:
import json
filename = 'numbers.json'
with open(filename) as f_obj:
    numbers = json.load(f_obj)
print(numbers)
Output
[2, 3, 5, 7, 11, 13]
We use the json.load() function to load the
information stored in numbers.json, and we store it in the variable numbers.
Finally we print the recovered list of numbers.
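The same pair of functions works for dictionaries as well as lists. The following sketch performs a round trip in a single program; the settings dictionary and the file name settings.json are hypothetical, and a temporary directory is assumed so that nothing is left behind.

```python
import json
import os
import tempfile

# Hypothetical user preferences, used only for this sketch.
settings = {'username': 'eric', 'volume': 7, 'favorite_numbers': [2, 3, 5]}

with tempfile.TemporaryDirectory() as tmp_dir:
    filename = os.path.join(tmp_dir, 'settings.json')

    # Store the dictionary in JSON format.
    with open(filename, 'w') as f_obj:
        json.dump(settings, f_obj)

    # Read it back into a new variable.
    with open(filename) as f_obj:
        loaded_settings = json.load(f_obj)

print(loaded_settings)
print(loaded_settings == settings)
```

The loaded dictionary compares equal to the original here because its keys are strings. JSON object keys are always strings, so a dictionary with integer keys would come back with string keys instead.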
Saving and Reading User-Generated Data
Saving data with json is useful when you’re working with user-generated
data, because if you don’t store your user’s information somehow, you’ll
lose it when the program stops running. Let’s look at an example where we
prompt the user for their name the first time they run a program and then
remember their name when they run the program again.
When someone runs our script, we want to retrieve their username from the
stored file if possible; therefore, we’ll start with a try block that attempts
to recover the username. If the file username.json doesn’t exist, we’ll have
the except block prompt for a username and store it in username.json for
next time:
import json
# Load the username, if it has been stored previously.
# Otherwise, prompt for the username and store it.
filename = 'username.json'
try:
    with open(filename) as f_obj:
        username = json.load(f_obj)
except FileNotFoundError:
    username = input("What is your name? ")
    with open(filename, 'w') as f_obj:
        json.dump(username, f_obj)
        print("We'll remember you when you come back, " + username + "!")
else:
    print("Welcome back, " + username + "!")
Output
What is your name? Eric
We'll remember you when you come back, Eric!
Otherwise, if a username was already stored:
Welcome back, Eric!
Refactoring
Often, you’ll come to a point where your code works, but you recognize that
you could improve it by breaking it up into a series of functions that have
specific jobs. This process is called refactoring. Refactoring
makes your code cleaner, easier to understand, and easier to extend.
Let’s refactor the code above by moving the bulk of its logic into functions.
We’ll create three functions that implement the same logic in a clearer,
more understandable way:
import json
def get_stored_username():
    """Get stored username if available."""
    filename = 'username.json'
    try:
        with open(filename) as f_obj:
            username = json.load(f_obj)
    except FileNotFoundError:
        return None
    else:
        return username
def get_new_username():
    """Prompt for a new username."""
    username = input("What is your name? ")
    filename = 'username.json'
    with open(filename, 'w') as f_obj:
        json.dump(username, f_obj)
    return username
def greet_user():
    """Greet the user by name."""
    username = get_stored_username()
    if username:
        print("Welcome back, " + username + "!")
    else:
        username = get_new_username()
        print("We'll remember you when you come back, " + username + "!")
greet_user()
Each function in this final version has a single, clear
purpose. We call greet_user(), and that function prints an appropriate message: it either welcomes back an existing user or greets a new user. It does
this by calling get_stored_username(), which is responsible only for retrieving
a stored username if one exists. Finally, greet_user() calls get_new_username()
if necessary, which is responsible only for getting a new username and storing it. This compartmentalization of work is an essential part of writing
clear code that will be easy to maintain and extend.
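One practical benefit of single-purpose functions is that each one can be exercised on its own. The sketch below is a variation on the functions above, not a replacement for them. It assumes the filename is passed in as a parameter, and it introduces a hypothetical helper, store_username(), that separates storing from prompting so the example can run without keyboard input.

```python
import json
import os
import tempfile


def get_stored_username(filename):
    """Get stored username if available."""
    try:
        with open(filename) as f_obj:
            username = json.load(f_obj)
    except FileNotFoundError:
        return None
    else:
        return username


def store_username(username, filename):
    """Store a username for next time (hypothetical helper)."""
    with open(filename, 'w') as f_obj:
        json.dump(username, f_obj)


with tempfile.TemporaryDirectory() as tmp_dir:
    filename = os.path.join(tmp_dir, 'username.json')

    before = get_stored_username(filename)  # No file yet.
    store_username('Eric', filename)
    after = get_stored_username(filename)   # The file now exists.

print(before)
print(after)
```

Passing the filename as an argument also means the name username.json would only need to be written once, in the code that calls these functions.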
Conclusion
In this blog, you learned how to work with files. You learned to read an
entire file at once and read through a file’s contents one line at a time. You
learned to write to a file and append text onto the end of a file. You read
about exceptions and how to handle the exceptions you’re likely to see in
your programs. Finally, you learned how to store Python data structures so
you can save information your users provide, preventing them from having
to start over each time they run a program.
This is the final article of the Learn Python series. Practice questions with solved code will be uploaded soon so that you can test your skills and build on them.
Happy coding!!