Convert Bytes to String in Python 3

python string python-3.x

Question

In Python 3, I captured the standard output of an external program as a bytes object using subprocess:

from subprocess import Popen, PIPE

stdout = Popen(['ls', '-l'], stdout=PIPE).communicate()[0]
print(stdout)

This produces output like:

b'total 0\n-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file1\n-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file2\n'

I want to convert this bytes value into a normal Python str so I can work with it as text and print readable output.

How do you convert a bytes object to a str in Python 3?

Short Answer

By the end of this page, you will understand the difference between bytes and str in Python 3, how to convert bytes to text using .decode(), when to choose an encoding such as UTF-8, and how to avoid common mistakes when reading subprocess output or other binary data.

Concept

In Python 3, bytes and str are different data types.

bytes represents raw binary data.
str represents human-readable text made of Unicode characters.

This distinction matters because computers store data as bytes, but your program usually wants to work with text.

For example, when you read from:

a file opened in binary mode
a network socket
subprocess output
an API response

Python often gives you bytes, not str.

To convert bytes into text, you use an encoding. The most common encoding is UTF-8.

text = some_bytes.decode('utf-8')

Why this matters:

Text operations like .split(), .replace(), or string formatting are meant for str
Mixing bytes and str causes errors
Correct decoding prevents garbled characters and encoding bugs
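A minimal sketch of the points above. The value `raw` is made up for illustration and only resembles the `ls -l` output from the question:

```python
# Illustrative bytes value (not real command output).
raw = b'total 0\nfile1\nfile2\n'

# Decode once, then use ordinary str methods on the result.
text = raw.decode('utf-8')
lines = text.splitlines()
print(lines)

# Mixing the two types fails: bytes.split() expects a bytes separator.
try:
    raw.split('\n')
except TypeError as exc:
    print('TypeError:', exc)
```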

Mental Model

Think of bytes as a sealed package of raw data and str as the readable message inside.

bytes = the packaged form computers transmit and store
str = the unpacked text humans read and edit

decode() is the unpacking step.

If the package was packed using UTF-8, you must unpack it using UTF-8 too. If you use the wrong encoding, the message may look corrupted or fail to open.
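The packing and unpacking idea can be sketched as a round trip. The sample string is arbitrary, and ASCII is used here only as an example of an encoding that cannot read the packed data:

```python
message = 'naïve café'

packed = message.encode('utf-8')     # str -> bytes ("packing")
unpacked = packed.decode('utf-8')    # bytes -> str with the same encoding
print(unpacked == message)

# Unpacking with an encoding that cannot represent these bytes fails outright.
try:
    packed.decode('ascii')
except UnicodeDecodeError as exc:
    print('decode failed:', exc.reason)
```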

Syntax and Examples

The basic syntax is:

text = byte_data.decode('utf-8')

If you already know the data uses UTF-8, this is the usual solution.

Example: decoding subprocess output

from subprocess import Popen, PIPE

stdout = Popen(['ls', '-l'], stdout=PIPE).communicate()[0]
text = stdout.decode('utf-8')

print(text)

This converts the bytes result into a normal Python string.

Example: simple bytes to string conversion

data = b'Hello, world!'
text = data.decode('utf-8')

print(text)
print(type(text))

Output:

Hello, world!
<class 'str'>

Better subprocess approach: get text directly

Instead of decoding later, you can ask subprocess to return text:

from subprocess import run, PIPE

result = run(['ls', '-l'], stdout=PIPE, text=True)
print(result.stdout)

Step by Step Execution

Consider this example:

data = b'Python\n'
text = data.decode('utf-8')
print(text)

Here is what happens step by step:

data = b'Python\n'
- data is a bytes object.
- It contains raw byte values representing the characters P, y, t, h, o, n, and a newline.
text = data.decode('utf-8')
- Python reads those bytes using the UTF-8 encoding.
- It converts them into a Unicode string.
- Now text is a str object.
print(text)
- print() displays the string as text.
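To see the same steps with types attached, a short trace like the following may help. It only inspects the values and adds nothing beyond the steps listed above:

```python
data = b'Python\n'
print(type(data), list(data))    # the integer value of each byte

text = data.decode('utf-8')
print(type(text), repr(text))    # repr() makes the trailing newline visible
```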

Real World Use Cases

Converting bytes to strings is common in many real programs.

Reading subprocess output

from subprocess import run, PIPE

result = run(['echo', 'hello'], stdout=PIPE)
text = result.stdout.decode('utf-8')

Used for:

shell command wrappers
automation scripts
deployment tools

Reading HTTP responses

response_bytes = b'{"status": "ok"}'
text = response_bytes.decode('utf-8')

Used for:

API clients
web scraping
integrations with external services
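As a rough sketch of what usually comes next, the decoded text can be handed to a parser. This assumes the response body is UTF-8 encoded JSON, which is common but not guaranteed for every API:

```python
import json

response_bytes = b'{"status": "ok"}'

# Decode first, then parse the resulting str as JSON.
payload = json.loads(response_bytes.decode('utf-8'))
print(payload['status'])
```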

Reading binary files that actually contain text

with open('notes.txt', 'rb') as f:
    data = f.read()

text = data.decode('utf-8')

Used when:

working with files in binary mode
parsing exported data
processing logs
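When the file is known to contain text, opening it in text mode with an explicit encoding is an alternative to decoding by hand. The sketch below writes a throwaway file so it can run anywhere; the file name and contents are placeholders:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'notes.txt')

    # Create a small UTF-8 file so the example is self-contained.
    with open(path, 'wb') as f:
        f.write('café\n'.encode('utf-8'))

    # Binary mode: read bytes, then decode yourself.
    with open(path, 'rb') as f:
        from_bytes = f.read().decode('utf-8')

    # Text mode: open() decodes while reading.
    with open(path, encoding='utf-8') as f:
        from_text = f.read()

print(from_bytes == from_text)
```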

Receiving socket data

packet = b'hello from client'  # illustrative payload, e.g. the result of sock.recv(1024)
message = packet.decode('utf-8')

Real Codebase Usage

In real projects, developers usually try to convert data to text as early as possible if the data is meant to be text.

Common patterns

1. Decode immediately after reading bytes

output = process.communicate()[0].decode('utf-8')

This keeps later code simple because everything after that works with str.

2. Use text mode in `subprocess`

from subprocess import run, PIPE

result = run(['ls', '-l'], stdout=PIPE, text=True)
output = result.stdout

This is cleaner than manual decoding when you expect text.

3. Specify encodings explicitly

text = data.decode('utf-8')

Being explicit avoids environment-dependent behavior.
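The same idea applies to subprocess. With text=True alone, the output is decoded using a locale-dependent default, while subprocess.run() also accepts an encoding argument (Python 3.6+) that pins it down. In this sketch, sys.executable stands in for whatever command you actually run:

```python
import sys
from subprocess import run, PIPE

# sys.executable is used only so the example runs on any platform;
# substitute your own command.
cmd = [sys.executable, '-c', "print('hello')"]

# Passing encoding= switches the pipes to text mode with that encoding.
result = run(cmd, stdout=PIPE, encoding='utf-8')
print(type(result.stdout), result.stdout.strip())
```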

4. Handle decoding errors carefully

text = data.decode('utf-8', errors='replace')

Useful when input may contain unexpected bytes.
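For comparison, here is how three of the standard error handlers treat the same invalid input. The sample bytes are contrived so that exactly one byte is not valid UTF-8:

```python
# \xe9 is the Latin-1 byte for "é"; on its own it is not valid UTF-8.
data = b'caf\xe9 ok'

replaced = data.decode('utf-8', errors='replace')          # U+FFFD marks the bad byte
ignored = data.decode('utf-8', errors='ignore')            # bad byte is dropped
escaped = data.decode('utf-8', errors='backslashreplace')  # bad byte kept as an escape

print(repr(replaced))
print(repr(ignored))
print(repr(escaped))
```

Both 'replace' and 'ignore' lose information, so they suit logs and display output better than data you need to round-trip.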

5. Validate before processing
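The page gives no detail for this item, so the following is only one possible reading: check that the input really is bytes and really decodes before handing it to later code. The helper name decode_text is hypothetical.

```python
def decode_text(data, encoding='utf-8'):
    """Return the decoded text, or None if data is not valid in that encoding.

    Hypothetical helper for illustration only.
    """
    if not isinstance(data, (bytes, bytearray)):
        raise TypeError(f'expected bytes, got {type(data).__name__}')
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


print(decode_text(b'ok'))
print(decode_text(b'\xff\xfe'))   # not valid UTF-8
```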

Common Mistakes

1. Calling `str()` on bytes

This is a very common mistake.

data = b'hello'
text = str(data)
print(text)

Output:

b'hello'

This does not decode the bytes. It creates a string representation of the bytes object.

Correct

text = data.decode('utf-8')

2. Using the wrong encoding

data = 'café'.encode('utf-8')
text = data.decode('latin-1')
print(text)

This runs without an error but prints cafÃ© instead of café, because Latin-1 maps each of the two UTF-8 bytes for é to its own character.

Fix

Decode using the same encoding that was used to create the bytes.
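A small sketch of the fix, plus a repair that works only when the text was wrongly decoded as Latin-1 and nothing else has altered it since:

```python
data = 'café'.encode('utf-8')

garbled = data.decode('latin-1')   # wrong encoding: no error, wrong characters
correct = data.decode('utf-8')     # same encoding that produced the bytes

# Latin-1 maps every byte to exactly one character, so the mistake is reversible.
repaired = garbled.encode('latin-1').decode('utf-8')

print(garbled, correct, repaired)
```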

3. Mixing `bytes` and `str`

name = 'Alice'
data = b'Hello '
result = data + name
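The last line raises TypeError, because Python 3 never converts between bytes and str implicitly. A sketch of the error and the two usual ways out, with the choice depending on whether later code wants text or bytes:

```python
name = 'Alice'
data = b'Hello '

try:
    result = data + name            # bytes + str is not allowed
except TypeError as exc:
    print('TypeError:', exc)

# Option 1: decode the bytes and work with str.
as_text = data.decode('utf-8') + name

# Option 2: encode the str and work with bytes.
as_bytes = data + name.encode('utf-8')

print(as_text, as_bytes)
```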

Comparisons

Concept	Purpose	Result type	Typical use
`bytes.decode('utf-8')`	Convert bytes to text	`str`	Reading subprocess output, files, network data
`str.encode('utf-8')`	Convert text to bytes	`bytes`	Writing files, sending data over network
`str(bytes_obj)`	String representation of object	`str`	Debugging only, not real decoding
`subprocess.run(..., text=True)`	Return text directly	`str`	Capturing command output as text

Cheat Sheet

Convert bytes to string

text = data.decode('utf-8')

Convert string to bytes

data = text.encode('utf-8')

Get subprocess output as text directly

from subprocess import run, PIPE
result = run(['ls', '-l'], stdout=PIPE, text=True)
print(result.stdout)

Safe decoding with invalid bytes

text = data.decode('utf-8', errors='replace')

Important rules

bytes and str are different in Python 3
Use .decode() to turn bytes into str
Use .encode() to turn str into bytes

FAQ

How do I convert bytes to string in Python 3?

Use .decode():

text = data.decode('utf-8')

Why does Python 3 separate bytes and str?

Because raw binary data and text are not the same thing. This makes encoding issues more explicit and reduces hidden bugs.

Is `str(my_bytes)` the right way to decode bytes?

No. It returns a representation like "b'hello'", not the actual decoded text.

What encoding should I use when decoding bytes?

Usually UTF-8, unless you know the data uses a different encoding.

How do I get subprocess output as a string directly?

Use text=True with subprocess.run() or related functions:

result = run(['ls', '-l'], stdout=PIPE, text=True)

What if decoding raises `UnicodeDecodeError`?

The bytes may use a different encoding, or contain invalid data. Try the correct encoding or use:

data.decode('utf-8', errors='replace')

Related Concepts

str.encode() — the reverse operation of converting text into bytes.
Unicode — explains how Python text works internally.
Character encodings — essential for understanding UTF-8, Latin-1, and decoding errors.
subprocess.run() — commonly used when working with command output.
File I/O modes — helps explain text mode versus binary mode.
UnicodeDecodeError — important when decoding fails.
Binary data vs text data — clarifies when to keep data as bytes instead of converting it.

Mini Project

Description

Build a small Python utility that runs an external command, captures its output, converts it from bytes to str, and prints each line with line numbers. This demonstrates a very common workflow in automation scripts and command-line tools.

Goal

Create a script that executes a command, decodes its output as UTF-8 text, and processes the result as normal Python strings.

Requirements

Run an external command using subprocess.
Capture standard output.
Convert the output from bytes to str.
Split the text into lines.
Print each non-empty line with a line number.
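One possible solution sketch, not the only way to meet the requirements. The function name numbered_lines and the demo command are illustrative, and sys.executable is used so the demo does not depend on ls being available:

```python
import sys
from subprocess import run, PIPE


def numbered_lines(cmd):
    """Run cmd and return its non-empty output lines, each prefixed with a number."""
    result = run(cmd, stdout=PIPE, check=True)               # stdout is bytes here
    text = result.stdout.decode('utf-8', errors='replace')   # bytes -> str
    non_empty = [line for line in text.splitlines() if line.strip()]
    return [f'{i}: {line}' for i, line in enumerate(non_empty, start=1)]


if __name__ == '__main__':
    # Demo command: prints two lines with a blank line between them.
    demo_cmd = [sys.executable, '-c', "print('alpha'); print(); print('beta')"]
    for line in numbered_lines(demo_cmd):
        print(line)
```

Using errors='replace' keeps the script from crashing on unexpected bytes; drop it if you would rather fail loudly on bad input.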

`.decode()` vs `text=True`

Approach	When to use	Example
`.decode()`	You already have bytes	`stdout.decode('utf-8')`
`text=True`	You control the subprocess call	`run(cmd, stdout=PIPE, text=True)`

`bytes` vs `str`

Type	Contains	Prefix example
`bytes`	Raw binary data	`b'hello'`
`str`	Unicode text	`'hello'`
