CODE WITH MARTIN

5. Strings


String Variables

So far we've worked with numbers, so let's look at how we work with text. A 'string' is another variable type that holds text.

The following code shows how to define a string variable:

greeting = "Hello"
print(greeting)

The variable named 'greeting' is defined on line 1 and assigned the text that is inside the quotation marks. The quotation marks around the text "Hello" tell Python that it is a string. We then use this variable in the print statement on line 2, which displays the text "Hello" in the terminal when you run the code.

Working with strings in any programming language is fundamental to creating useful software. Consider an application form on a website that allows you to register. You would enter a username and probably an email address. Both of these inputs are strings. You then want to check that the username doesn't contain any spaces and is less than 20 characters in length. You will also want to check that the email address looks like a real email address. Both of these checks require us to know how to work with strings.
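
As a taste of where this is heading, here is a minimal sketch of those checks. The variable names and example values are made up for illustration, and it uses a few features (such as 'in', 'and' and 'len') that are covered later. The email check is deliberately rough and would not be enough for a real website.

```python
# Hypothetical form inputs; these values are for illustration only.
username = "martin_codes"
email = "martin@example.com"

# Check 1: the username must not contain any spaces.
has_no_spaces = " " not in username

# Check 2: the username must be less than 20 characters long.
is_short_enough = len(username) < 20

# Rough email check: exactly one "@" and at least one "." after it.
# Real email validation is more involved than this.
looks_like_email = email.count("@") == 1 and "." in email.split("@")[1]

print(has_no_spaces)
print(is_short_enough)
print(looks_like_email)
```

Each check produces either True or False, so the program could use the results to decide whether to accept the form.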

We're going to learn how to work with strings in a variety of ways.

First up, let's see how we can join strings together using the '+' operator:

greeting = "Hello"
name = "Martin"
message = greeting + " " + name

print(message)

Lines 1 and 2 show how we define two string variables. Line 3 also defines a new variable named 'message' that is assigned the result of 3 strings being added together. You'll notice how we use a space character in between the greeting variable and the name variable. This is so that when we print the 'message' variable, we see 'Hello Martin', and not 'HelloMartin'.

Multiline Strings

We can create multiline strings that include new lines just as the new lines appear in the code file. Example:

msg = """Hi,

I'm writing to let you know that you're awesome.

Regards, Martin."""

print(msg)

Notice here how we use three quotation marks before and after the text to tell Python that we want to create a string that includes new lines. Python treats all the text between the opening triple quotes on line 1 and the closing triple quotes on line 5 as a single string, which is assigned to the variable 'msg'.

There is a variation of this multiline string that uses three apostrophe characters instead of quotation marks. It makes no difference at all to the value of the string:

msg = '''Hi,

I'm writing to let you know that you're awesome.

Regards, Martin.'''

print(msg)

It is preferred that you use the triple quotation marks style over the apostrophe style.

Escape Sequences

When defining a string, you might wonder what would happen if you really need quotes in your string. You might write something like this:

msg = "To activate Siri on your phone, say "Hey Siri"."
print(msg)

If you run this code, you will see an error in your terminal that starts with "SyntaxError: invalid syntax". This is a Python error telling you that Python doesn't understand the statement on line 1 of your code. When Python reads the second quotation mark, it treats it as the end of the string, so the words Hey Siri that follow do not mean anything to Python.

To fix this problem, we use 'escape sequences' to indicate to Python that we really want a quotation mark in our string. Here is what the escape sequence looks like:

msg = "To activate Siri on your phone, say \"Hey Siri\"."
print(msg)

All we have done is put a backslash '\' character before each quotation mark inside the string. This tells Python to treat the quotation mark as part of the string text rather than as the end of the string. We call the \" characters an 'escape sequence'.
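
As an aside, Python also lets you wrap a string in single apostrophes, in which case quotation marks inside it need no escaping. The short sketch below compares the two spellings; the variable names are just for illustration.

```python
# The same text written two ways.
escaped = "To activate Siri on your phone, say \"Hey Siri\"."
single_quoted = 'To activate Siri on your phone, say "Hey Siri".'

# Both spellings produce exactly the same string value.
same_text = escaped == single_quoted

print(escaped)
print(same_text)
```

Which style you pick is mostly a matter of taste, as long as you stay consistent.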

There are a few more useful escape sequences to be aware of. They are...

Tabs (escape sequence \t):

msg = "Hello\tWorld"
print(msg)

New Lines (escape sequence \n):

msg = "Hello\nWorld"
print(msg)

Backslash (escape sequence \\):

msg = "You can use \\n to add a new line to a string."
print(msg)

Notice how using double backslash allows us to use the actual text \n in the string (run it to see). Without the double backslash, Python would think we wanted to add a new line.
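
One way to convince yourself of what an escape sequence really is: although we type two characters, Python stores a single character in the string. The sketch below checks this using the 'len' function, which is introduced properly later in this section; the variable names are illustrative.

```python
tabbed = "Hello\tWorld"      # \t is stored as one tab character
two_lines = "Hello\nWorld"   # \n is stored as one new line character
literal = "Hello\\nWorld"    # \\ is stored as one backslash, followed by the letter n

print(len(tabbed))     # 11
print(len(two_lines))  # 11
print(len(literal))    # 12

# A triple-quoted string with a real line break holds the same value as the \n version.
multiline = """Hello
World"""
print(multiline == two_lines)  # True
```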

Formatting

A very useful feature of strings is the ability to use placeholders in the string which allows us to replace those placeholders with variable values. This is a really handy feature when you consider a real world scenario that might happen in a marketing company.

Imagine you wanted to write some software that allows a team of marketers to send an email to a large list of clients detailing a new product offer. The marketing team make new offers every day. We could ask the marketing team to build a string containing whatever text they wanted as long as they include the placeholders. Our software could then insert customer names and things like today's date using those placeholders.

Let's take a look at the many ways placeholders help us build strings. The following example shows how to insert an integer variable into a string:

price = 99
message = "The price of the item is {} pounds."
output = message.format(price)

print(output)

On line 2, we define a string variable named 'message'. Inside the string we have the curly brackets '{}' - this is our placeholder. On line 3, we are defining another variable named 'output', that is assigned the result of the Python statement 'message.format(price)'.

When we write the text '.format' after a string variable, it tells Python that we want to replace the placeholders inside that string with values. We provide the variables we want to use in the replacement by writing their names inside the round brackets '()'. As you can see, we have written the name of the 'price' variable in between the brackets. Python then processes the string and does the replacing. The end result is assigned to the variable 'output'. Run the program and see!

What we just did, in proper programming terms, is execute a 'function'. The function we executed is called 'format'. We will learn all about functions in the next section, as that is an entire topic on its own.

We can use multiple placeholders and use multiple variables in the 'format' function. Here's an example:

item = "TV Stand"
price = 99

message = "The price of the item {} is {} pounds."
output = message.format(item, price)

print(output)

On line 5, you'll see how we provide the 'item' and 'price' variables in the brackets. They are separated by a comma. You can supply any number of variables as long as they are separated by commas. The format function uses the first variable for the first placeholder it finds in the string, and the second variable in the brackets for the second placeholder in the string.
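
It is worth knowing what happens when the number of variables does not match the number of placeholders. The sketch below shows both cases. It uses 'try' and 'except', which have not been covered yet, only so that the program keeps running after the error; the variable names are illustrative.

```python
message = "The price of the item {} is {} pounds."

# Too many values: the extra one is simply ignored.
with_extra = message.format("TV Stand", 99, "unused")
print(with_extra)

# Too few values: Python raises an IndexError for the second placeholder.
failed = False
try:
    message.format("TV Stand")
except IndexError as error:
    failed = True
    print("Could not format the message:", error)
```

So the safe habit is to supply at least as many variables as there are placeholders.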

Another way of writing placeholders is using index numbers. The index numbers refer to the position of the variables provided to the format function. Let's see what that looks like:

item = "TV Stand"
price = 99

message = "The price of the item {0} is {1} pounds. The item {0} is currently on offer."
output = message.format(item, price)

print(output)

The placeholder '{0}' in the string on line 4 refers to the first variable in the brackets after the 'format' text on line 5, and the '{1}' placeholder refers to the second variable. By using index placeholders, we can reuse a variable multiple times in the string, which we do by using '{0}' twice in the string on line 4.

There is one more formatting style that offers even more control of your placeholders and variables. This is called 'named placeholders'. Here's an example:

item = "TV Stand"
price = 99

message = "The price of the item {name} is {cost} pounds. The item {name} is currently on offer."
output = message.format(name = item, cost = price)

print(output)

Instead of using an index number to refer to variables in the placeholders, we can use an actual name. This name can be anything you want. The magic happens in the brackets on line 5. You'll see how we assign our variables 'item' and 'price' to the name of the placeholders 'name' and 'cost'. If you use a lot of placeholders in a large string, using named placeholders can be much easier to read and manage.
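
Coming back to the marketing scenario from the start of this topic, here is a rough sketch of how named placeholders could drive an email template. The template text, customer names and date are all invented for illustration; a real program would load them from somewhere else.

```python
# Hypothetical template written by the marketing team.
template = """Hi {name},

Today ({date}) only, the {item} is on offer for {cost} pounds.

Regards, Martin."""

item = "TV Stand"
price = 99
offer_date = "1 March"  # assumed date, hard-coded to keep the example simple

# The same template is reused for each customer; only the name changes.
email_for_alice = template.format(name="Alice", date=offer_date, item=item, cost=price)
email_for_bob = template.format(name="Bob", date=offer_date, item=item, cost=price)

print(email_for_alice)
print(email_for_bob)
```

Because the placeholders are named, the marketing team can move them around or repeat them in the text without the code having to change.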

More String Functions

There are lots of other functions we can use with strings. Let's take a quick tour through some of the most commonly used ones:

Get the length of a string:

name = "Martin"
length = len(name)

print(length)

Convert string to upper case:

name = "Martin"
name = name.upper()

print(name)

Remove spaces from the start and end:

name = "   Martin   "
name = name.strip()

print(name)

Remove spaces from the start only:

name = "   Martin   "
name = name.lstrip()

print(name)

Remove spaces from the end only:

name = "   Martin   "
name = name.rstrip()

print(name)

Replace text in a string:

msg = "I like oranges."
print(msg)

msg = msg.replace("oranges", "turtles")
print(msg)
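
Two details are easy to miss in the examples above: spaces are hard to see when printed, and these functions give back a new string rather than changing the original. The sketch below makes both visible by printing square brackets around each result. The variable names are illustrative.

```python
raw_name = "   martin   "

# Square brackets make the remaining spaces visible when printed.
left_stripped = raw_name.lstrip()
right_stripped = raw_name.rstrip()
print("[" + left_stripped + "]")   # [martin   ]
print("[" + right_stripped + "]")  # [   martin]

# Functions can be chained: strip the spaces first, then convert to upper case.
clean_name = raw_name.strip().upper()
print("[" + clean_name + "]")      # [MARTIN]

# The original string is unchanged, which is why the earlier examples
# assign the result back to the variable.
print("[" + raw_name + "]")        # [   martin   ]
```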

Converting Variable Types

A very common thing we need to do is convert numbers to strings, or strings to numbers. For example, you might have loaded some data from somewhere like a file, or from a website, and the number you need to work with for calculations is inside a string variable.

Python has 3 helpful functions for converting things to integers, floats, and strings. Here's an example of them all:

num_in_string = "34"
float_in_string = "25.9"
some_num = 10
some_float = 25.8

result1 = int(num_in_string)
result2 = float(float_in_string)
result3 = str(some_float)

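You can see on lines 6, 7, and 8 how we use the functions 'int', 'float' and 'str'. Each of them converts whatever value you give it to the named type. So 'int(some_variable)' will convert a float, or a string containing a whole number such as "34", into an integer. A string like "25.9" has to go through 'float' first, because 'int' raises an error when the text is not a whole number.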
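
To see why these conversions matter, here is a small sketch that uses the converted values. It also shows one thing worth checking for yourself: 'int' accepts a string only when the text is a whole number. The 'try' and 'except' lines are there just to keep the program running after the error, and the variable names are illustrative.

```python
# A number stored as text can be used in a calculation once converted.
total = int("34") + 1
print(total)  # 35

# Converting a float to an integer drops the decimal part; it does not round.
whole = int(25.8)
print(whole)  # 25

# A number must be converted to a string before joining it with '+'.
label = "Price: " + str(25.8)
print(label)  # Price: 25.8

# int() cannot read a decimal number directly from a string.
conversion_failed = False
try:
    int("25.9")
except ValueError:
    conversion_failed = True
print(conversion_failed)  # True

# Going through float() first works.
via_float = int(float("25.9"))
print(via_float)  # 25
```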

What We Learned