Watch Now This tutorial has a related video course created by the Existent Python team. Lookout information technology together with the written tutorial to deepen your agreement: Strings and Character Data in Python

In the tutorial on Basic Information Types in Python, you learned how to define strings: objects that contain sequences of graphic symbol data. Processing character data is integral to programming. It is a rare application that doesn't need to manipulate strings at least to some extent.

Hither's what you'll learn in this tutorial: Python provides a rich set up of operators, functions, and methods for working with strings. When you are finished with this tutorial, y'all will know how to access and extract portions of strings, and also be familiar with the methods that are available to manipulate and modify string information.

You lot will likewise be introduced to two other Python objects used to correspond raw byte data, the bytes and bytearray types.

String Manipulation

The sections beneath highlight the operators, methods, and functions that are bachelor for working with strings.

String Operators

You take already seen the operators + and * applied to numeric operands in the tutorial on Operators and Expressions in Python. These ii operators tin can be applied to strings too.

The + Operator

The + operator concatenates strings. It returns a string consisting of the operands joined together, every bit shown here:

>>>

                                                        >>>                                        s                    =                    'foo'                    >>>                                        t                    =                    'bar'                    >>>                                        u                    =                    'baz'                    >>>                                        s                    +                    t                    'foobar'                    >>>                                        s                    +                    t                    +                    u                    'foobarbaz'                    >>>                                        print                    (                    'Get team'                    +                    '!!!'                    )                    Go team!!!                                  

The * Operator

The * operator creates multiple copies of a string. If south is a string and n is an integer, either of the following expressions returns a string consisting of northward concatenated copies of s:

s * n
n * due south

Here are examples of both forms:

>>>

                                                        >>>                                        s                    =                    'foo.'                    >>>                                        s                    *                    4                    'foo.foo.foo.foo.'                    >>>                                        4                    *                    s                    'foo.foo.foo.foo.'                                  

The multiplier operand northward must exist an integer. Y'all'd recall it would be required to be a positive integer, but amusingly, information technology can be nothing or negative, in which case the upshot is an empty string:

If you were to create a string variable and initialize it to the empty cord past assigning it the value 'foo' * -8, anyone would rightly think you lot were a bit daft. But it would piece of work.

The in Operator

Python as well provides a membership operator that can be used with strings. The in operator returns True if the starting time operand is independent within the second, and Faux otherwise:

>>>

                                                        >>>                                        s                    =                    'foo'                    >>>                                        south                    in                    'That                    \'                    s food for thought.'                    True                    >>>                                        s                    in                    'That                    \'                    s expert for now.'                    Imitation                                  

There is also a non in operator, which does the contrary:

>>>

                                                        >>>                                        'z'                    not                    in                    'abc'                    True                    >>>                                        'z'                    non                    in                    'xyz'                    False                                  

Built-in String Functions

Equally you saw in the tutorial on Basic Data Types in Python, Python provides many functions that are congenital-in to the interpreter and always bachelor. Here are a few that work with strings:

Function Description
chr() Converts an integer to a character
ord() Converts a character to an integer
len() Returns the length of a string
str() Returns a string representation of an object

These are explored more fully below.

ord(c)

Returns an integer value for the given grapheme.

At the virtually basic level, computers shop all information every bit numbers. To represent character information, a translation scheme is used which maps each character to its representative number.

The simplest scheme in common use is called ASCII. It covers the common Latin characters you are probably nigh accepted to working with. For these characters, ord(c) returns the ASCII value for character c:

>>>

                                                        >>>                                        ord                    (                    'a'                    )                    97                    >>>                                        ord                    (                    '#'                    )                    35                                  

ASCII is fine as far as it goes. Simply in that location are many different languages in use in the world and countless symbols and glyphs that appear in digital media. The full ready of characters that potentially may need to be represented in estimator lawmaking far surpasses the ordinary Latin letters, numbers, and symbols yous usually see.

Unicode is an aggressive standard that attempts to provide a numeric code for every possible character, in every possible linguistic communication, on every possible platform. Python 3 supports Unicode extensively, including allowing Unicode characters within strings.

As long equally you stay in the domain of the common characters, at that place is trivial practical difference betwixt ASCII and Unicode. But the ord() function will render numeric values for Unicode characters as well:

>>>

                                                        >>>                                        ord                    (                    '€'                    )                    8364                    >>>                                        ord                    (                    '∑'                    )                    8721                                  

chr(n)

Returns a graphic symbol value for the given integer.

chr() does the reverse of ord(). Given a numeric value n, chr(northward) returns a cord representing the character that corresponds to north:

>>>

                                                        >>>                                        chr                    (                    97                    )                    'a'                    >>>                                        chr                    (                    35                    )                    '#'                                  

chr() handles Unicode characters equally well:

>>>

                                                        >>>                                        chr                    (                    8364                    )                    '€'                    >>>                                        chr                    (                    8721                    )                    '∑'                                  

len(southward)

Returns the length of a cord.

With len(), you tin can check Python cord length. len(south) returns the number of characters in southward:

>>>

                                                        >>>                                        southward                    =                    'I am a string.'                    >>>                                        len                    (                    southward                    )                    14                                  

str(obj)

Returns a string representation of an object.

Nearly any object in Python can be rendered equally a string. str(obj) returns the string representation of object obj:

>>>

                                                        >>>                                        str                    (                    49.2                    )                    '49.2'                    >>>                                        str                    (                    three                    +                    4                    j                    )                    '(3+4j)'                    >>>                                        str                    (                    three                    +                    29                    )                    '32'                    >>>                                        str                    (                    'foo'                    )                    'foo'                                  

String Indexing

Often in programming languages, individual items in an ordered set of data can exist accessed directly using a numeric index or key value. This process is referred to as indexing.

In Python, strings are ordered sequences of character information, and thus tin can be indexed in this mode. Individual characters in a cord can be accessed by specifying the string name followed by a number in foursquare brackets ([]).

String indexing in Python is zero-based: the start character in the string has index 0, the next has index 1, then on. The index of the last character volition be the length of the string minus i.

For example, a schematic diagram of the indices of the string 'foobar' would expect like this:

String index 1
String Indices

The private characters can be accessed by index as follows:

>>>

                                                  >>>                                    south                  =                  'foobar'                  >>>                                    s                  [                  0                  ]                  'f'                  >>>                                    s                  [                  1                  ]                  'o'                  >>>                                    s                  [                  iii                  ]                  'b'                  >>>                                    len                  (                  due south                  )                  6                  >>>                                    s                  [                  len                  (                  s                  )                  -                  1                  ]                  'r'                              

Attempting to alphabetize beyond the finish of the cord results in an error:

>>>

                                                  >>>                                    s                  [                  half dozen                  ]                  Traceback (virtually recent telephone call last):                  File                  "<pyshell#17>", line                  1, in                  <module>                  south                  [                  6                  ]                  IndexError:                  string index out of range                              

Cord indices can also be specified with negative numbers, in which case indexing occurs from the end of the cord backward: -1 refers to the last grapheme, -2 the 2d-to-last graphic symbol, and and then on. Here is the same diagram showing both the positive and negative indices into the string 'foobar':

String index 2
Positive and Negative String Indices

Here are some examples of negative indexing:

>>>

                                                  >>>                                    due south                  =                  'foobar'                  >>>                                    due south                  [                  -                  1                  ]                  'r'                  >>>                                    due south                  [                  -                  2                  ]                  'a'                  >>>                                    len                  (                  south                  )                  6                  >>>                                    s                  [                  -                  len                  (                  s                  )]                  'f'                              

Attempting to alphabetize with negative numbers across the start of the cord results in an mistake:

>>>

                                                  >>>                                    s                  [                  -                  seven                  ]                  Traceback (near contempo call last):                  File                  "<pyshell#26>", line                  1, in                  <module>                  southward                  [                  -                  7                  ]                  IndexError:                  string index out of range                              

For any non-empty cord s, s[len(southward)-1] and s[-i] both return the final character. There isn't any alphabetize that makes sense for an empty cord.

String Slicing

Python also allows a form of indexing syntax that extracts substrings from a string, known as string slicing. If s is a cord, an expression of the form s[g:n] returns the portion of due south starting with position thou, and upward to but non including position n:

>>>

                                                  >>>                                    southward                  =                  'foobar'                  >>>                                    due south                  [                  2                  :                  5                  ]                  'oba'                              

Once more, the 2d index specifies the first character that is not included in the result—the character 'r' (south[v]) in the example above. That may seem slightly unintuitive, but it produces this outcome which makes sense: the expression s[m:n] volition render a substring that is n - m characters in length, in this case, 5 - 2 = 3.

If you omit the first index, the slice starts at the offset of the string. Thus, s[:m] and s[0:m] are equivalent:

>>>

                                                  >>>                                    south                  =                  'foobar'                  >>>                                    s                  [:                  four                  ]                  'foob'                  >>>                                    s                  [                  0                  :                  4                  ]                  'foob'                              

Similarly, if you omit the second index every bit in s[due north:], the slice extends from the first index through the end of the cord. This is a nice, concise alternative to the more cumbersome due south[n:len(s)]:

>>>

                                                  >>>                                    southward                  =                  'foobar'                  >>>                                    s                  [                  2                  :]                  'obar'                  >>>                                    south                  [                  2                  :                  len                  (                  due south                  )]                  'obar'                              

For any cord south and any integer northward (0 ≤ n ≤ len(s)), s[:north] + s[n:] volition be equal to south:

>>>

                                                  >>>                                    s                  =                  'foobar'                  >>>                                    southward                  [:                  4                  ]                  +                  s                  [                  four                  :]                  'foobar'                  >>>                                    s                  [:                  4                  ]                  +                  s                  [                  4                  :]                  ==                  s                  True                              

Omitting both indices returns the original string, in its entirety. Literally. It's not a copy, it's a reference to the original cord:

>>>

                                                  >>>                                    s                  =                  'foobar'                  >>>                                    t                  =                  s                  [:]                  >>>                                    id                  (                  s                  )                  59598496                  >>>                                    id                  (                  t                  )                  59598496                  >>>                                    due south                  is                  t                  Truthful                              

If the commencement index in a slice is greater than or equal to the second alphabetize, Python returns an empty cord. This is withal another obfuscated style to generate an empty cord, in case you were looking for one:

>>>

                                                  >>>                                    s                  [                  2                  :                  2                  ]                  ''                  >>>                                    s                  [                  4                  :                  2                  ]                  ''                              

Negative indices tin exist used with slicing as well. -1 refers to the last character, -2 the second-to-last, and and then on, just every bit with unproblematic indexing. The diagram below shows how to piece the substring 'oob' from the string 'foobar' using both positive and negative indices:

String index 3
String Slicing with Positive and Negative Indices

Here is the corresponding Python code:

>>>

                                                  >>>                                    s                  =                  'foobar'                  >>>                                    due south                  [                  -                  5                  :                  -                  2                  ]                  'oob'                  >>>                                    s                  [                  1                  :                  4                  ]                  'oob'                  >>>                                    s                  [                  -                  v                  :                  -                  ii                  ]                  ==                  s                  [                  i                  :                  four                  ]                  Truthful                              

Specifying a Step in a String Slice

There is ane more variant of the slicing syntax to hash out. Adding an additional : and a third index designates a footstep (besides called a pace), which indicates how many characters to jump after retrieving each character in the slice.

For example, for the cord 'foobar', the piece 0:6:ii starts with the showtime character and ends with the terminal character (the whole string), and every 2d grapheme is skipped. This is shown in the post-obit diagram:

String stride 1
String Indexing with Stride

Similarly, i:6:2 specifies a slice starting with the second character (index 1) and ending with the last character, and once again the stride value ii causes every other graphic symbol to be skipped:

String stride 2
Some other Cord Indexing with Footstep

The illustrative REPL code is shown hither:

>>>

                                                  >>>                                    southward                  =                  'foobar'                  >>>                                    southward                  [                  0                  :                  six                  :                  two                  ]                  'foa'                  >>>                                    s                  [                  1                  :                  6                  :                  two                  ]                  'obr'                              

As with whatsoever slicing, the commencement and 2d indices can be omitted, and default to the first and last characters respectively:

>>>

                                                  >>>                                    s                  =                  '12345'                  *                  5                  >>>                                    due south                  '1234512345123451234512345'                  >>>                                    s                  [::                  5                  ]                  '11111'                  >>>                                    s                  [                  iv                  ::                  5                  ]                  '55555'                              

Yous tin can specify a negative footstep value likewise, in which case Python steps astern through the string. In that instance, the starting/showtime index should be greater than the ending/2d index:

>>>

                                                  >>>                                    due south                  =                  'foobar'                  >>>                                    s                  [                  5                  :                  0                  :                  -                  2                  ]                  'rbo'                              

In the above example, 5:0:-2 means "kickoff at the final character and step astern by ii, upward to but non including the first grapheme."

When you lot are stepping astern, if the first and second indices are omitted, the defaults are reversed in an intuitive mode: the first alphabetize defaults to the end of the cord, and the second alphabetize defaults to the start. Here is an case:

>>>

                                                  >>>                                    s                  =                  '12345'                  *                  v                  >>>                                    south                  '1234512345123451234512345'                  >>>                                    s                  [::                  -                  5                  ]                  '55555'                              

This is a mutual epitome for reversing a string:

>>>

                                                  >>>                                    s                  =                  'If Comrade Napoleon says information technology, information technology must be right.'                  >>>                                    southward                  [::                  -                  i                  ]                  '.thgir eb tsum ti ,ti syas noelopaN edarmoC fI'                              

Interpolating Variables Into a String

In Python version three.half-dozen, a new string formatting mechanism was introduced. This feature is formally named the Formatted Cord Literal, but is more normally referred to by its nickname f-string.

The formatting adequacy provided by f-strings is extensive and won't be covered in full particular here. If you want to larn more, you tin bank check out the Real Python article Python 3'south f-Strings: An Improved String Formatting Syntax (Guide). There is besides a tutorial on Formatted Output coming up afterward in this serial that digs deeper into f-strings.

One simple feature of f-strings y'all tin start using correct away is variable interpolation. You can specify a variable name directly inside an f-string literal, and Python will supplant the name with the corresponding value.

For example, suppose you lot want to display the result of an arithmetic calculation. You can do this with a straightforward impress() statement, separating numeric values and cord literals by commas:

>>>

                                                  >>>                                    n                  =                  20                  >>>                                    thou                  =                  25                  >>>                                    prod                  =                  n                  *                  thou                  >>>                                    print                  (                  'The product of'                  ,                  due north                  ,                  'and'                  ,                  yard                  ,                  'is'                  ,                  prod                  )                  The product of 20 and 25 is 500                              

But this is cumbersome. To reach the aforementioned thing using an f-string:

  • Specify either a lowercase f or uppercase F direct before the opening quote of the string literal. This tells Python it is an f-string instead of a standard string.
  • Specify any variables to be interpolated in curly braces ({}).

Recast using an f-cord, the above case looks much cleaner:

>>>

                                                  >>>                                    northward                  =                  20                  >>>                                    1000                  =                  25                  >>>                                    prod                  =                  n                  *                  m                  >>>                                    impress                  (                  f                  'The product of                                    {                  northward                  }                                      and                                    {                  m                  }                                      is                                    {                  prod                  }                  '                  )                  The product of 20 and 25 is 500                              

Any of Python's 3 quoting mechanisms can exist used to define an f-string:

>>>

                                                  >>>                                    var                  =                  'Bark'                  >>>                                    print                  (                  f                  'A dog says                                    {                  var                  }                  !'                  )                  A dog says Bark!                  >>>                                    print                  (                  f                  "A canis familiaris says                                    {                  var                  }                  !"                  )                  A canis familiaris says Bark!                  >>>                                    print                  (                  f                  '''A canis familiaris says                                    {                  var                  }                  !'''                  )                  A canis familiaris says Bark!                              

Modifying Strings

In a nutshell, yous can't. Strings are 1 of the data types Python considers immutable, meaning non able to be changed. In fact, all the data types yous have seen so far are immutable. (Python does provide data types that are mutable, equally you will soon encounter.)

A statement like this will cause an mistake:

>>>

                                                  >>>                                    southward                  =                  'foobar'                  >>>                                    s                  [                  3                  ]                  =                  'x'                  Traceback (nigh recent call last):                  File                  "<pyshell#xl>", line                  1, in                  <module>                  s                  [                  3                  ]                  =                  'x'                  TypeError:                  'str' object does not back up item consignment                              

In truth, there really isn't much need to alter strings. You tin usually easily attain what you want by generating a copy of the original string that has the desired change in place. At that place are very many ways to do this in Python. Hither is one possibility:

>>>

                                                  >>>                                    s                  =                  southward                  [:                  3                  ]                  +                  'ten'                  +                  due south                  [                  iv                  :]                  >>>                                    s                  'fooxar'                              

There is as well a congenital-in string method to accomplish this:

>>>

                                                  >>>                                    s                  =                  'foobar'                  >>>                                    due south                  =                  s                  .                  replace                  (                  'b'                  ,                  'x'                  )                  >>>                                    s                  'fooxar'                              

Read on for more data most congenital-in string methods!

Born String Methods

Yous learned in the tutorial on Variables in Python that Python is a highly object-oriented language. Every item of information in a Python program is an object.

You are also familiar with functions: callable procedures that you can invoke to perform specific tasks.

Methods are similar to functions. A method is a specialized blazon of callable procedure that is tightly associated with an object. Similar a function, a method is called to perform a singled-out task, but information technology is invoked on a specific object and has noesis of its target object during execution.

The syntax for invoking a method on an object is as follows:

This invokes method .foo() on object obj. <args> specifies the arguments passed to the method (if any).

You will explore much more about defining and calling methods later on in the discussion of object-oriented programming. For now, the goal is to present some of the more ordinarily used congenital-in methods Python supports for operating on string objects.

In the following method definitions, arguments specified in foursquare brackets ([]) are optional.

Case Conversion

Methods in this group perform case conversion on the target string.

due south.capitalize()

Capitalizes the target cord.

s.capitalize() returns a re-create of s with the first character converted to uppercase and all other characters converted to lowercase:

>>>

                                                        >>>                                        s                    =                    'foO BaR BAZ quX'                    >>>                                        s                    .                    capitalize                    ()                    'Foo bar baz qux'                                  

Non-alphabetic characters are unchanged:

>>>

                                                        >>>                                        s                    =                    'foo123#BAR#.'                    >>>                                        s                    .                    capitalize                    ()                    'Foo123#bar#.'                                  

s.lower()

Converts alphabetic characters to lowercase.

south.lower() returns a copy of s with all alphabetic characters converted to lowercase:

>>>

                                                        >>>                                        'FOO Bar 123 baz qUX'                    .                    lower                    ()                    'foo bar 123 baz qux'                                  

s.swapcase()

Swaps case of alphabetic characters.

southward.swapcase() returns a re-create of south with uppercase alphabetic characters converted to lowercase and vice versa:

>>>

                                                        >>>                                        'FOO Bar 123 baz qUX'                    .                    swapcase                    ()                    'foo bAR 123 BAZ Qux'                                  

s.title()

Converts the target cord to "championship instance."

s.title() returns a copy of s in which the start letter of each word is converted to capital letter and remaining letters are lowercase:

>>>

                                                        >>>                                        'the sunday also rises'                    .                    championship                    ()                    'The Lord's day Also Rises'                                  

This method uses a adequately simple algorithm. It does non attempt to distinguish between important and unimportant words, and it does not handle apostrophes, possessives, or acronyms gracefully:

>>>

                                                        >>>                                        "what'south happened to ted's IBM stock?"                    .                    title                    ()                    "What'Due south Happened To Ted'S Ibm Stock?"                                  

due south.upper()

Converts alphabetic characters to uppercase.

s.upper() returns a copy of s with all alphabetic characters converted to upper-case letter:

>>>

                                                        >>>                                        'FOO Bar 123 baz qUX'                    .                    upper                    ()                    'FOO BAR 123 BAZ QUX'                                  

Detect and Replace

These methods provide various means of searching the target string for a specified substring.

Each method in this group supports optional <beginning> and <end> arguments. These are interpreted every bit for cord slicing: the activeness of the method is restricted to the portion of the target string starting at character position <start> and proceeding upwardly to only not including character position <end>. If <start> is specified but <stop> is not, the method applies to the portion of the target string from <offset> through the cease of the string.

s.count(<sub>[, <start>[, <stop>]])

Counts occurrences of a substring in the target cord.

s.count(<sub>) returns the number of non-overlapping occurrences of substring <sub> in s:

>>>

                                                        >>>                                        'foo goo moo'                    .                    count                    (                    'oo'                    )                    3                                  

The count is restricted to the number of occurrences within the substring indicated by <start> and <terminate>, if they are specified:

>>>

                                                        >>>                                        'foo goo moo'                    .                    count                    (                    'oo'                    ,                    0                    ,                    8                    )                    2                                  

s.endswith(<suffix>[, <showtime>[, <stop>]])

Determines whether the target string ends with a given substring.

s.endswith(<suffix>) returns True if south ends with the specified <suffix> and Fake otherwise:

>>>

                                                        >>>                                        'foobar'                    .                    endswith                    (                    'bar'                    )                    Truthful                    >>>                                        'foobar'                    .                    endswith                    (                    'baz'                    )                    Imitation                                  

The comparison is restricted to the substring indicated by <start> and <end>, if they are specified:

>>>

                                                        >>>                                        'foobar'                    .                    endswith                    (                    'oob'                    ,                    0                    ,                    4                    )                    True                    >>>                                        'foobar'                    .                    endswith                    (                    'oob'                    ,                    2                    ,                    4                    )                    False                                  

s.find(<sub>[, <beginning>[, <end>]])

Searches the target cord for a given substring.

Y'all can use .find() to see if a Python string contains a detail substring. s.find(<sub>) returns the lowest alphabetize in due south where substring <sub> is found:

>>>

                                                        >>>                                        'foo bar foo baz foo qux'                    .                    notice                    (                    'foo'                    )                    0                                  

This method returns -1 if the specified substring is not found:

>>>

                                                        >>>                                        'foo bar foo baz foo qux'                    .                    find                    (                    'grault'                    )                    -1                                  

The search is restricted to the substring indicated past <showtime> and <end>, if they are specified:

>>>

                                                        >>>                                        'foo bar foo baz foo qux'                    .                    detect                    (                    'foo'                    ,                    four                    )                    8                    >>>                                        'foo bar foo baz foo qux'                    .                    detect                    (                    'foo'                    ,                    4                    ,                    7                    )                    -1                                  

southward.index(<sub>[, <first>[, <stop>]])

Searches the target string for a given substring.

This method is identical to .find(), except that information technology raises an exception if <sub> is not establish rather than returning -ane:

>>>

                                                        >>>                                        'foo bar foo baz foo qux'                    .                    index                    (                    'grault'                    )                    Traceback (most recent call concluding):                    File                    "<pyshell#0>", line                    i, in                    <module>                    'foo bar foo baz foo qux'                    .                    alphabetize                    (                    'grault'                    )                    ValueError:                    substring non institute                                  

s.rfind(<sub>[, <start>[, <cease>]])

Searches the target cord for a given substring starting at the end.

s.rfind(<sub>) returns the highest index in s where substring <sub> is found:

>>>

                                                        >>>                                        'foo bar foo baz foo qux'                    .                    rfind                    (                    'foo'                    )                    sixteen                                  

As with .discover(), if the substring is not found, -ane is returned:

>>>

                                                        >>>                                        'foo bar foo baz foo qux'                    .                    rfind                    (                    'grault'                    )                    -1                                  

The search is restricted to the substring indicated by <starting time> and <cease>, if they are specified:

>>>

                                                        >>>                                        'foo bar foo baz foo qux'                    .                    rfind                    (                    'foo'                    ,                    0                    ,                    14                    )                    8                    >>>                                        'foo bar foo baz foo qux'                    .                    rfind                    (                    'foo'                    ,                    10                    ,                    14                    )                    -1                                  

due south.rindex(<sub>[, <starting time>[, <finish>]])

Searches the target string for a given substring starting at the end.

This method is identical to .rfind(), except that it raises an exception if <sub> is non found rather than returning -1:

>>>

                                                        >>>                                        'foo bar foo baz foo qux'                    .                    rindex                    (                    'grault'                    )                    Traceback (most recent call last):                    File                    "<pyshell#one>", line                    1, in                    <module>                    'foo bar foo baz foo qux'                    .                    rindex                    (                    'grault'                    )                    ValueError:                    substring not found                                  

south.startswith(<prefix>[, <first>[, <end>]])

Determines whether the target string starts with a given substring.

When yous use the Python .startswith() method, due south.startswith(<suffix>) returns Truthful if southward starts with the specified <suffix> and False otherwise:

>>>

                                                        >>>                                        'foobar'                    .                    startswith                    (                    'foo'                    )                    True                    >>>                                        'foobar'                    .                    startswith                    (                    'bar'                    )                    Imitation                                  

The comparing is restricted to the substring indicated by <starting time> and <terminate>, if they are specified:

>>>

                                                        >>>                                        'foobar'                    .                    startswith                    (                    'bar'                    ,                    iii                    )                    True                    >>>                                        'foobar'                    .                    startswith                    (                    'bar'                    ,                    3                    ,                    2                    )                    Fake                                  

Character Classification

Methods in this group classify a string based on the characters it contains.

s.isalnum()

Determines whether the target cord consists of alphanumeric characters.

s.isalnum() returns Truthful if southward is nonempty and all its characters are alphanumeric (either a letter or a number), and Simulated otherwise:

>>>

                                                        >>>                                        'abc123'                    .                    isalnum                    ()                    Truthful                    >>>                                        'abc$123'                    .                    isalnum                    ()                    Imitation                    >>>                                        ''                    .                    isalnum                    ()                    Fake                                  

s.isalpha()

Determines whether the target string consists of alphabetic characters.

s.isalpha() returns True if s is nonempty and all its characters are alphabetic, and False otherwise:

>>>

                                                        >>>                                        'ABCabc'                    .                    isalpha                    ()                    True                    >>>                                        'abc123'                    .                    isalpha                    ()                    Faux                                  

s.isdigit()

Determines whether the target string consists of digit characters.

You can use the .isdigit() Python method to check if your string is made of merely digits. s.isdigit() returns True if s is nonempty and all its characters are numeric digits, and False otherwise:

>>>

                                                        >>>                                        '123'                    .                    isdigit                    ()                    True                    >>>                                        '123abc'                    .                    isdigit                    ()                    False                                  

s.isidentifier()

Determines whether the target string is a valid Python identifier.

s.isidentifier() returns Truthful if s is a valid Python identifier according to the linguistic communication definition, and Imitation otherwise:

>>>

                                                        >>>                                        'foo32'                    .                    isidentifier                    ()                    True                    >>>                                        '32foo'                    .                    isidentifier                    ()                    Simulated                    >>>                                        'foo$32'                    .                    isidentifier                    ()                    Fake                                  

s.islower()

Determines whether the target string's alphabetic characters are lowercase.

s.islower() returns Truthful if southward is nonempty and all the alphabetic characters it contains are lowercase, and False otherwise. Not-alphabetic characters are ignored:

>>>

                                                        >>>                                        'abc'                    .                    islower                    ()                    True                    >>>                                        'abc1$d'                    .                    islower                    ()                    True                    >>>                                        'Abc1$D'                    .                    islower                    ()                    False                                  

due south.isprintable()

Determines whether the target cord consists entirely of printable characters.

s.isprintable() returns Truthful if southward is empty or all the alphabetic characters it contains are printable. It returns False if s contains at least ane non-printable character. Non-alphabetic characters are ignored:

>>>

                                                        >>>                                        'a                    \t                    b'                    .                    isprintable                    ()                    False                    >>>                                        'a b'                    .                    isprintable                    ()                    True                    >>>                                        ''                    .                    isprintable                    ()                    True                    >>>                                        'a                    \n                    b'                    .                    isprintable                    ()                    False                                  

south.isspace()

Determines whether the target string consists of whitespace characters.

s.isspace() returns True if s is nonempty and all characters are whitespace characters, and Fake otherwise.

The most commonly encountered whitespace characters are space ' ', tab '\t', and newline '\n':

>>>

                                                        >>>                                        '                                        \t                                                            \n                                          '                    .                    isspace                    ()                    True                    >>>                                        '   a   '                    .                    isspace                    ()                    Fake                                  

Withal, there are a few other ASCII characters that qualify as whitespace, and if yous account for Unicode characters, there are quite a few beyond that:

>>>

                                                        >>>                                        '                    \f\u2005\r                    '                    .                    isspace                    ()                    Truthful                                  

('\f' and '\r' are the escape sequences for the ASCII Form Feed and Carriage Return characters; '\u2005' is the escape sequence for the Unicode Four-Per-Em Space.)

s.istitle()

Determines whether the target string is championship cased.

s.istitle() returns Truthful if due south is nonempty, the kickoff alphabetic character of each word is upper-case letter, and all other alphabetic characters in each word are lowercase. It returns False otherwise:

>>>

                                                        >>>                                        'This Is A Title'                    .                    istitle                    ()                    Truthful                    >>>                                        'This is a title'                    .                    istitle                    ()                    False                    >>>                                        'Requite Me The #$#@ Brawl!'                    .                    istitle                    ()                    True                                  

due south.isupper()

Determines whether the target string's alphabetic characters are uppercase.

south.isupper() returns True if s is nonempty and all the alphabetic characters it contains are uppercase, and False otherwise. Non-alphabetic characters are ignored:

>>>

                                                        >>>                                        'ABC'                    .                    isupper                    ()                    True                    >>>                                        'ABC1$D'                    .                    isupper                    ()                    True                    >>>                                        'Abc1$D'                    .                    isupper                    ()                    False                                  

Cord Formatting

Methods in this group modify or heighten the format of a string.

s.center(<width>[, <fill>])

Centers a string in a field.

s.center(<width>) returns a string consisting of due south centered in a field of width <width>. Past default, padding consists of the ASCII space character:

>>>

                                                        >>>                                        'foo'                    .                    eye                    (                    10                    )                    '   foo    '                                  

If the optional <fill> argument is specified, it is used as the padding graphic symbol:

>>>

                                                        >>>                                        'bar'                    .                    middle                    (                    ten                    ,                    '-'                    )                    '---bar----'                                  

If s is already at least as long every bit <width>, it is returned unchanged:

>>>

                                                        >>>                                        'foo'                    .                    center                    (                    2                    )                    'foo'                                  

s.expandtabs(tabsize=8)

Expands tabs in a string.

s.expandtabs() replaces each tab character ('\t') with spaces. Past default, spaces are filled in assuming a tab stop at every eighth column:

>>>

                                                        >>>                                        'a                    \t                    b                    \t                    c'                    .                    expandtabs                    ()                    'a       b       c'                    >>>                                        'aaa                    \t                    bbb                    \t                    c'                    .                    expandtabs                    ()                    'aaa     bbb     c'                                  

tabsize is an optional keyword parameter specifying alternate tab cease columns:

>>>

                                                        >>>                                        'a                    \t                    b                    \t                    c'                    .                    expandtabs                    (                    4                    )                    'a   b   c'                    >>>                                        'aaa                    \t                    bbb                    \t                    c'                    .                    expandtabs                    (                    tabsize                    =                    four                    )                    'aaa bbb c'                                  

s.ljust(<width>[, <fill up>])

Left-justifies a string in field.

s.ljust(<width>) returns a string consisting of southward left-justified in a field of width <width>. By default, padding consists of the ASCII space character:

>>>

                                                        >>>                                        'foo'                    .                    ljust                    (                    10                    )                    'foo       '                                  

If the optional <fill> statement is specified, it is used as the padding character:

>>>

                                                        >>>                                        'foo'                    .                    ljust                    (                    ten                    ,                    '-'                    )                    'foo-------'                                  

If s is already at least as long equally <width>, it is returned unchanged:

>>>

                                                        >>>                                        'foo'                    .                    ljust                    (                    two                    )                    'foo'                                  

s.lstrip([<chars>])

Trims leading characters from a cord.

s.lstrip() returns a copy of s with whatsoever whitespace characters removed from the left cease:

>>>

                                                        >>>                                        '   foo bar baz   '                    .                    lstrip                    ()                    'foo bar baz   '                    >>>                                        '                    \t\n                    foo                    \t\n                    bar                    \t\n                    baz'                    .                    lstrip                    ()                    'foo\t\nbar\t\nbaz'                                  

If the optional <chars> statement is specified, it is a cord that specifies the set of characters to be removed:

>>>

                                                        >>>                                        'http://www.realpython.com'                    .                    lstrip                    (                    '/:pth'                    )                    'world wide web.realpython.com'                                  

s.replace(<erstwhile>, <new>[, <count>])

Replaces occurrences of a substring within a string.

In Python, to remove a character from a string, you lot can use the Python string .replace() method. s.replace(<old>, <new>) returns a re-create of s with all occurrences of substring <old> replaced by <new>:

>>>

                                                        >>>                                        'foo bar foo baz foo qux'                    .                    supervene upon                    (                    'foo'                    ,                    'grault'                    )                    'grault bar grault baz grault qux'                                  

If the optional <count> argument is specified, a maximum of <count> replacements are performed, starting at the left finish of s:

>>>

                                                        >>>                                        'foo bar foo baz foo qux'                    .                    supervene upon                    (                    'foo'                    ,                    'grault'                    ,                    2                    )                    'grault bar grault baz foo qux'                                  

s.rjust(<width>[, <make full>])

Right-justifies a string in a field.

s.rjust(<width>) returns a string consisting of s right-justified in a field of width <width>. By default, padding consists of the ASCII space character:

>>>

                                                        >>>                                        'foo'                    .                    rjust                    (                    ten                    )                    '       foo'                                  

If the optional <fill> statement is specified, it is used as the padding character:

>>>

                                                        >>>                                        'foo'                    .                    rjust                    (                    ten                    ,                    '-'                    )                    '-------foo'                                  

If s is already at least as long every bit <width>, information technology is returned unchanged:

>>>

                                                        >>>                                        'foo'                    .                    rjust                    (                    2                    )                    'foo'                                  

s.rstrip([<chars>])

Trims trailing characters from a string.

s.rstrip() returns a copy of due south with any whitespace characters removed from the right end:

>>>

                                                        >>>                                        '   foo bar baz   '                    .                    rstrip                    ()                    '   foo bar baz'                    >>>                                        'foo                    \t\n                    bar                    \t\northward                    baz                    \t\n                    '                    .                    rstrip                    ()                    'foo\t\nbar\t\nbaz'                                  

If the optional <chars> argument is specified, information technology is a string that specifies the set of characters to be removed:

>>>

                                                        >>>                                        'foo.$$$;'                    .                    rstrip                    (                    ';$.'                    )                    'foo'                                  

southward.strip([<chars>])

Strips characters from the left and right ends of a string.

s.strip() is essentially equivalent to invoking s.lstrip() and s.rstrip() in succession. Without the <chars> statement, information technology removes leading and trailing whitespace:

>>>

                                                        >>>                                        south                    =                    '   foo bar baz                    \t\t\t                    '                    >>>                                        s                    =                    south                    .                    lstrip                    ()                    >>>                                        southward                    =                    south                    .                    rstrip                    ()                    >>>                                        s                    'foo bar baz'                                  

As with .lstrip() and .rstrip(), the optional <chars> argument specifies the set of characters to be removed:

>>>

                                                        >>>                                        'www.realpython.com'                    .                    strip                    (                    'w.moc'                    )                    'realpython'                                  

s.zfill(<width>)

Pads a string on the left with zeros.

southward.zfill(<width>) returns a re-create of s left-padded with '0' characters to the specified <width>:

>>>

                                                        >>>                                        '42'                    .                    zfill                    (                    five                    )                    '00042'                                  

If s contains a leading sign, it remains at the left edge of the result string afterwards zeros are inserted:

>>>

                                                        >>>                                        '+42'                    .                    zfill                    (                    8                    )                    '+0000042'                    >>>                                        '-42'                    .                    zfill                    (                    eight                    )                    '-0000042'                                  

If due south is already at least every bit long as <width>, it is returned unchanged:

>>>

                                                        >>>                                        '-42'                    .                    zfill                    (                    3                    )                    '-42'                                  

.zfill() is near useful for cord representations of numbers, but Python volition nevertheless happily cipher-pad a cord that isn't:

>>>

                                                        >>>                                        'foo'                    .                    zfill                    (                    6                    )                    '000foo'                                  

Converting Between Strings and Lists

Methods in this grouping convert betwixt a string and some composite data type by either pasting objects together to brand a cord, or by breaking a string upwards into pieces.

These methods operate on or return iterables, the general Python term for a sequential drove of objects. You will explore the inner workings of iterables in much more detail in the upcoming tutorial on definite iteration.

Many of these methods render either a list or a tuple. These are two similar blended data types that are prototypical examples of iterables in Python. They are covered in the next tutorial, then y'all're about to learn nearly them soon! Until then, simply think of them equally sequences of values. A list is enclosed in foursquare brackets ([]), and a tuple is enclosed in parentheses (()).

With that introduction, let's have a look at this last group of string methods.

south.bring together(<iterable>)

Concatenates strings from an iterable.

due south.join(<iterable>) returns the cord that results from concatenating the objects in <iterable> separated by southward.

Note that .bring together() is invoked on south, the separator string. <iterable> must be a sequence of string objects as well.

Some sample lawmaking should help clarify. In the post-obit example, the separator s is the string ', ', and <iterable> is a list of string values:

>>>

                                                        >>>                                        ', '                    .                    join                    ([                    'foo'                    ,                    'bar'                    ,                    'baz'                    ,                    'qux'                    ])                    'foo, bar, baz, qux'                                  

The result is a unmarried cord consisting of the listing objects separated by commas.

In the next example, <iterable> is specified as a single string value. When a string value is used every bit an iterable, it is interpreted equally a list of the string'southward private characters:

>>>

                                                        >>>                                        list                    (                    'corge'                    )                    ['c', 'o', 'r', 'g', 'e']                    >>>                                        ':'                    .                    join                    (                    'corge'                    )                    'c:o:r:g:eastward'                                  

Thus, the result of ':'.bring together('corge') is a string consisting of each character in 'corge' separated by ':'.

This instance fails because ane of the objects in <iterable> is non a cord:

>>>

                                                        >>>                                        '---'                    .                    bring together                    ([                    'foo'                    ,                    23                    ,                    'bar'                    ])                    Traceback (most recent call concluding):                    File                    "<pyshell#0>", line                    1, in                    <module>                    '---'                    .                    join                    ([                    'foo'                    ,                    23                    ,                    'bar'                    ])                    TypeError:                    sequence detail one: expected str example, int found                                  

That can be remedied, though:

>>>

                                                        >>>                                        '---'                    .                    bring together                    ([                    'foo'                    ,                    str                    (                    23                    ),                    'bar'                    ])                    'foo---23---bar'                                  

Every bit you will soon see, many composite objects in Python can exist construed equally iterables, and .join() is particularly useful for creating strings from them.

s.partition(<sep>)

Divides a string based on a separator.

s.segmentation(<sep>) splits s at the first occurrence of string <sep>. The return value is a three-part tuple consisting of:

  • The portion of s preceding <sep>
  • <sep> itself
  • The portion of s post-obit <sep>

Here are a couple examples of .partition() in action:

>>>

                                                        >>>                                        'foo.bar'                    .                    sectionalization                    (                    '.'                    )                    ('foo', '.', 'bar')                    >>>                                        'foo@@bar@@baz'                    .                    partition                    (                    '@@'                    )                    ('foo', '@@', 'bar@@baz')                                  

If <sep> is not found in s, the returned tuple contains s followed by 2 empty strings:

>>>

                                                        >>>                                        'foo.bar'                    .                    partition                    (                    '@@'                    )                    ('foo.bar', '', '')                                  

south.rpartition(<sep>)

Divides a string based on a separator.

s.rpartition(<sep>) functions exactly like due south.partitioning(<sep>), except that s is carve up at the last occurrence of <sep> instead of the starting time occurrence:

>>>

                                                        >>>                                        'foo@@bar@@baz'                    .                    partition                    (                    '@@'                    )                    ('foo', '@@', 'bar@@baz')                    >>>                                        'foo@@bar@@baz'                    .                    rpartition                    (                    '@@'                    )                    ('foo@@bar', '@@', 'baz')                                  

s.rsplit(sep=None, maxsplit=-1)

Splits a string into a list of substrings.

Without arguments, s.rsplit() splits s into substrings delimited past any sequence of whitespace and returns the substrings as a listing:

>>>

                                                        >>>                                        'foo bar baz qux'                    .                    rsplit                    ()                    ['foo', 'bar', 'baz', 'qux']                    >>>                                        'foo                    \n\t                    bar   baz                    \r\f                    qux'                    .                    rsplit                    ()                    ['foo', 'bar', 'baz', 'qux']                                  

If <sep> is specified, it is used as the delimiter for splitting:

>>>

                                                        >>>                                        'foo.bar.baz.qux'                    .                    rsplit                    (                    sep                    =                    '.'                    )                    ['foo', 'bar', 'baz', 'qux']                                  

(If <sep> is specified with a value of None, the string is split delimited by whitespace, just as though <sep> had not been specified at all.)

When <sep> is explicitly given equally a delimiter, consecutive delimiters in s are assumed to circumscribe empty strings, which will be returned:

>>>

                                                        >>>                                        'foo...bar'                    .                    rsplit                    (                    sep                    =                    '.'                    )                    ['foo', '', '', 'bar']                                  

This is non the case when <sep> is omitted, however. In that case, consecutive whitespace characters are combined into a unmarried delimiter, and the resulting list will never incorporate empty strings:

>>>

                                                        >>>                                        'foo                    \t\t\t                    bar'                    .                    rsplit                    ()                    ['foo', 'bar']                                  

If the optional keyword parameter <maxsplit> is specified, a maximum of that many splits are performed, starting from the right end of s:

>>>

                                                        >>>                                        'world wide web.realpython.com'                    .                    rsplit                    (                    sep                    =                    '.'                    ,                    maxsplit                    =                    ane                    )                    ['www.realpython', 'com']                                  

The default value for <maxsplit> is -1, which means all possible splits should be performed—the same as if <maxsplit> is omitted entirely:

>>>

                                                        >>>                                        'www.realpython.com'                    .                    rsplit                    (                    sep                    =                    '.'                    ,                    maxsplit                    =-                    1                    )                    ['www', 'realpython', 'com']                    >>>                                        'www.realpython.com'                    .                    rsplit                    (                    sep                    =                    '.'                    )                    ['world wide web', 'realpython', 'com']                                  

s.dissever(sep=None, maxsplit=-1)

Splits a cord into a list of substrings.

s.split() behaves exactly like s.rsplit(), except that if <maxsplit> is specified, splits are counted from the left cease of s rather than the right end:

>>>

                                                        >>>                                        'world wide web.realpython.com'                    .                    divide                    (                    '.'                    ,                    maxsplit                    =                    one                    )                    ['world wide web', 'realpython.com']                    >>>                                        'www.realpython.com'                    .                    rsplit                    (                    '.'                    ,                    maxsplit                    =                    1                    )                    ['www.realpython', 'com']                                  

If <maxsplit> is not specified, .split() and .rsplit() are duplicate.

s.splitlines([<keepends>])

Breaks a string at line boundaries.

s.splitlines() splits s up into lines and returns them in a list. Whatsoever of the following characters or character sequences is considered to constitute a line purlieus:

Escape Sequence Grapheme
\northward Newline
\r Wagon Return
\r\n Carriage Render + Line Feed
\five or \x0b Line Tabulation
\f or \x0c Class Feed
\x1c File Separator
\x1d Grouping Separator
\x1e Record Separator
\x85 Adjacent Line (C1 Control Code)
\u2028 Unicode Line Separator
\u2029 Unicode Paragraph Separator

Here is an example using several different line separators:

>>>

                                                        >>>                                        'foo                    \n                    bar                    \r\n                    baz                    \f                    qux                    \u2028                    quux'                    .                    splitlines                    ()                    ['foo', 'bar', 'baz', 'qux', 'quux']                                  

If sequent line boundary characters are present in the cord, they are assumed to circumscribe bare lines, which will appear in the result list:

>>>

                                                        >>>                                        'foo                    \f\f\f                    bar'                    .                    splitlines                    ()                    ['foo', '', '', 'bar']                                  

If the optional <keepends> statement is specified and is truthy, then the lines boundaries are retained in the event strings:

>>>

                                                        >>>                                        'foo                    \n                    bar                    \n                    baz                    \n                    qux'                    .                    splitlines                    (                    True                    )                    ['foo\due north', 'bar\north', 'baz\n', 'qux']                    >>>                                        'foo                    \due north                    bar                    \n                    baz                    \due north                    qux'                    .                    splitlines                    (                    ane                    )                    ['foo\n', 'bar\northward', 'baz\n', 'qux']                                  

bytes Objects

The bytes object is ane of the core congenital-in types for manipulating binary data. A bytes object is an immutable sequence of single byte values. Each element in a bytes object is a pocket-size integer in the range 0 to 255.

Defining a Literal bytes Object

A bytes literal is defined in the same manner as a string literal with the addition of a 'b' prefix:

>>>

                                                  >>>                                    b                  =                  b                  'foo bar baz'                  >>>                                    b                  b'foo bar baz'                  >>>                                    type                  (                  b                  )                  <form 'bytes'>                              

As with strings, you lot can employ any of the single, double, or triple quoting mechanisms:

>>>

                                                  >>>                                    b                  'Contains embedded "double" quotes'                  b'Contains embedded "double" quotes'                  >>>                                    b                  "Contains embedded 'single' quotes"                  b"Contains embedded 'single' quotes"                  >>>                                    b                  '''Contains embedded "double" and 'unmarried' quotes'''                  b'Contains embedded "double" and \'single\' quotes'                  >>>                                    b                  """Contains embedded "double" and 'single' quotes"""                  b'Contains embedded "double" and \'single\' quotes'                              

Only ASCII characters are immune in a bytes literal. Any character value greater than 127 must be specified using an appropriate escape sequence:

>>>

                                                  >>>                                    b                  =                  b                  'foo                  \xdd                  bar'                  >>>                                    b                  b'foo\xddbar'                  >>>                                    b                  [                  3                  ]                  221                  >>>                                    int                  (                  0xdd                  )                  221                              

The 'r' prefix may be used on a bytes literal to disable processing of escape sequences, as with strings:

>>>

                                                  >>>                                    b                  =                  rb                  'foo\xddbar'                  >>>                                    b                  b'foo\\xddbar'                  >>>                                    b                  [                  three                  ]                  92                  >>>                                    chr                  (                  92                  )                  '\\'                              

Defining a bytes Object With the Built-in bytes() Function

The bytes() part also creates a bytes object. What sort of bytes object gets returned depends on the argument(s) passed to the role. The possible forms are shown below.

bytes(<south>, <encoding>)

Creates a bytes object from a string.

bytes(<due south>, <encoding>) converts string <southward> to a bytes object, using str.encode() co-ordinate to the specified <encoding>:

>>>

                                                  >>>                                    b                  =                  bytes                  (                  'foo.bar'                  ,                  'utf8'                  )                  >>>                                    b                  b'foo.bar'                  >>>                                    blazon                  (                  b                  )                  <class 'bytes'>                              

bytes(<size>)

Creates a bytes object consisting of nada (0x00) bytes.

bytes(<size>) defines a bytes object of the specified <size>, which must be a positive integer. The resulting bytes object is initialized to nix (0x00) bytes:

>>>

                                                  >>>                                    b                  =                  bytes                  (                  8                  )                  >>>                                    b                  b'\x00\x00\x00\x00\x00\x00\x00\x00'                  >>>                                    type                  (                  b                  )                  <class 'bytes'>                              

bytes(<iterable>)

Creates a bytes object from an iterable.

bytes(<iterable>) defines a bytes object from the sequence of integers generated by <iterable>. <iterable> must be an iterable that generates a sequence of integers north in the range 0 ≤ n ≤ 255:

>>>

                                                  >>>                                    b                  =                  bytes                  ([                  100                  ,                  102                  ,                  104                  ,                  106                  ,                  108                  ])                  >>>                                    b                  b'dfhjl'                  >>>                                    type                  (                  b                  )                  <form 'bytes'>                  >>>                                    b                  [                  2                  ]                  104                              

Operations on bytes Objects

Like strings, bytes objects support the mutual sequence operations:

  • The in and non in operators:

    >>>

                                                                  >>>                                            b                      =                      b                      'abcde'                      >>>                                            b                      'cd'                      in                      b                      True                      >>>                                            b                      'foo'                      not                      in                      b                      True                                      
  • The concatenation (+) and replication (*) operators:

    >>>

                                                                  >>>                                            b                      =                      b                      'abcde'                      >>>                                            b                      +                      b                      'fghi'                      b'abcdefghi'                      >>>                                            b                      *                      3                      b'abcdeabcdeabcde'                                      
  • Indexing and slicing:

    >>>

                                                                  >>>                                            b                      =                      b                      'abcde'                      >>>                                            b                      [                      2                      ]                      99                      >>>                                            b                      [                      1                      :                      iii                      ]                      b'bc'                                      
  • Built-in functions:

    >>>

                                                                  >>>                                            len                      (                      b                      )                      five                      >>>                                            min                      (                      b                      )                      97                      >>>                                            max                      (                      b                      )                      101                                      

Many of the methods defined for string objects are valid for bytes objects besides:

>>>

                                                  >>>                                    b                  =                  b                  'foo,bar,foo,baz,foo,qux'                  >>>                                    b                  .                  count                  (                  b                  'foo'                  )                  3                  >>>                                    b                  .                  endswith                  (                  b                  'qux'                  )                  True                  >>>                                    b                  .                  find                  (                  b                  'baz'                  )                  12                  >>>                                    b                  .                  split                  (                  sep                  =                  b                  ','                  )                  [b'foo', b'bar', b'foo', b'baz', b'foo', b'qux']                  >>>                                    b                  .                  center                  (                  30                  ,                  b                  '-'                  )                  b'---foo,bar,foo,baz,foo,qux----'                              

Detect, however, that when these operators and methods are invoked on a bytes object, the operand and arguments must be bytes objects likewise:

>>>

                                                  >>>                                    b                  =                  b                  'foo.bar'                  >>>                                    b                  +                  '.baz'                  Traceback (about contempo call last):                  File                  "<pyshell#72>", line                  1, in                  <module>                  b                  +                  '.baz'                  TypeError:                  can't concat bytes to str                  >>>                                    b                  +                  b                  '.baz'                  b'foo.bar.baz'                  >>>                                    b                  .                  divide                  (                  sep                  =                  '.'                  )                  Traceback (well-nigh contempo phone call last):                  File                  "<pyshell#74>", line                  ane, in                  <module>                  b                  .                  split                  (                  sep                  =                  '.'                  )                  TypeError:                  a bytes-like object is required, not 'str'                  >>>                                    b                  .                  split                  (                  sep                  =                  b                  '.'                  )                  [b'foo', b'bar']                              

Although a bytes object definition and representation is based on ASCII text, information technology really behaves like an immutable sequence of small integers in the range 0 to 255, inclusive. That is why a single element from a bytes object is displayed as an integer:

>>>

                                                  >>>                                    b                  =                  b                  'foo                  \xdd                  bar'                  >>>                                    b                  [                  3                  ]                  221                  >>>                                    hex                  (                  b                  [                  3                  ])                  '0xdd'                  >>>                                    min                  (                  b                  )                  97                  >>>                                    max                  (                  b                  )                  221                              

A slice is displayed equally a bytes object though, even if it is only ane byte long:

Yous tin can catechumen a bytes object into a list of integers with the born list() function:

>>>

                                                  >>>                                    list                  (                  b                  )                  [97, 98, 99, 100, 101]                              

Hexadecimal numbers are frequently used to specify binary data considering two hexadecimal digits correspond directly to a single byte. The bytes course supports ii boosted methods that facilitate conversion to and from a cord of hexadecimal digits.

bytes.fromhex(<s>)

Returns a bytes object constructed from a string of hexadecimal values.

bytes.fromhex(<south>) returns the bytes object that results from converting each pair of hexadecimal digits in <due south> to the corresponding byte value. The hexadecimal digit pairs in <southward> may optionally be separated past whitespace, which is ignored:

>>>

                                                  >>>                                    b                  =                  bytes                  .                  fromhex                  (                  ' aa 68 4682cc '                  )                  >>>                                    b                  b'\xaahF\x82\xcc'                  >>>                                    listing                  (                  b                  )                  [170, 104, seventy, 130, 204]                              

b.hex()

Returns a string of hexadecimal value from a bytes object.

b.hex() returns the result of converting bytes object b into a string of hexadecimal digit pairs. That is, information technology does the contrary of .fromhex():

>>>

                                                  >>>                                    b                  =                  bytes                  .                  fromhex                  (                  ' aa 68 4682cc '                  )                  >>>                                    b                  b'\xaahF\x82\xcc'                  >>>                                    b                  .                  hex                  ()                  'aa684682cc'                  >>>                                    blazon                  (                  b                  .                  hex                  ())                  <class 'str'>                              

bytearray Objects

Python supports another binary sequence blazon called the bytearray. bytearray objects are very like bytes objects, despite some differences:

  • There is no defended syntax congenital into Python for defining a bytearray literal, like the 'b' prefix that may exist used to ascertain a bytes object. A bytearray object is always created using the bytearray() built-in function:

    >>>

                                                                  >>>                                            ba                      =                      bytearray                      (                      'foo.bar.baz'                      ,                      'UTF-viii'                      )                      >>>                                            ba                      bytearray(b'foo.bar.baz')                      >>>                                            bytearray                      (                      six                      )                      bytearray(b'\x00\x00\x00\x00\x00\x00')                      >>>                                            bytearray                      ([                      100                      ,                      102                      ,                      104                      ,                      106                      ,                      108                      ])                      bytearray(b'dfhjl')                                      
  • bytearray objects are mutable. You can modify the contents of a bytearray object using indexing and slicing:

    >>>

                                                                  >>>                                            ba                      =                      bytearray                      (                      'foo.bar.baz'                      ,                      'UTF-8'                      )                      >>>                                            ba                      bytearray(b'foo.bar.baz')                      >>>                                            ba                      [                      5                      ]                      =                      0xee                      >>>                                            ba                      bytearray(b'foo.b\xeer.baz')                      >>>                                            ba                      [                      eight                      :                      eleven                      ]                      =                      b                      'qux'                      >>>                                            ba                      bytearray(b'foo.b\xeer.qux')                                      

A bytearray object may exist constructed directly from a bytes object also:

>>>

                                                  >>>                                    ba                  =                  bytearray                  (                  b                  'foo'                  )                  >>>                                    ba                  bytearray(b'foo')                              

Conclusion

This tutorial provided an in-depth expect at the many different mechanisms Python provides for string handling, including string operators, built-in functions, indexing, slicing, and built-in methods. You too were introduced to the bytes and bytearray types.

These types are the first types you accept examined that are blended—built from a collection of smaller parts. Python provides several blended built-in types. In the next tutorial, y'all will explore two of the nigh frequently used: lists and tuples.

Watch Now This tutorial has a related video course created by the Real Python squad. Picket it together with the written tutorial to deepen your understanding: Strings and Graphic symbol Data in Python