Question
In Python 3, I captured the standard output of an external program as a bytes object using subprocess:
from subprocess import Popen, PIPE
stdout = Popen(['ls', '-l'], stdout=PIPE).communicate()[0]
print(stdout)
This produces output like:
b'total 0\n-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file1\n-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file2\n'
I want to convert this bytes value into a normal Python str so I can work with it as text and print readable output.
How do you convert a bytes object to a str in Python 3?
Short Answer
By the end of this page, you will understand the difference between bytes and str in Python 3, how to convert bytes to text using .decode(), when to choose an encoding such as UTF-8, and how to avoid common mistakes when reading subprocess output or other binary data.
Concept
In Python 3, bytes and str are different data types.
- bytes represents raw binary data.
- str represents human-readable text made of Unicode characters.
This distinction matters because computers store data as bytes, but your program usually wants to work with text.
For example, when you read from:
- a file opened in binary mode
- a network socket
- subprocess output
- an API response
Python often gives you bytes, not str.
To convert bytes into text, you use an encoding. The most common encoding is UTF-8.
text = some_bytes.decode('utf-8')
Why this matters:
- Text operations like .split(), .replace(), or string formatting are meant for str
- Mixing bytes and str causes errors
- Correct decoding prevents garbled characters and encoding bugs
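As a rough illustration of these points, the sketch below decodes a made-up byte string once and then uses ordinary string methods on it. The sample data and variable names are assumptions chosen for the example.

```python
# Hypothetical sample data: bytes as they might arrive from a file or socket.
raw = b'name=Alice;city=Paris'

# Decode once, then use ordinary string methods on the result.
text = raw.decode('utf-8')
fields = dict(pair.split('=') for pair in text.split(';'))
print(fields)

# bytes has a split() method too, but it rejects a str separator.
try:
    raw.split(';')
except TypeError as exc:
    print('TypeError:', exc)
```

Decoding at the boundary keeps the rest of the code free of bytes/str mixing.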
Mental Model
Think of bytes as a sealed package of raw data and str as the readable message inside.
- bytes = the packaged form computers transmit and store
- str = the unpacked text humans read and edit
decode() is the unpacking step.
If the package was packed using UTF-8, you must unpack it using UTF-8 too. If you use the wrong encoding, the message may look corrupted or fail to open.
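Both failure modes can be seen in a small sketch. The word 'café' and the Latin-1 codec are just convenient examples here; any mismatched pair of encodings can behave similarly.

```python
original = 'café'

# Packed as UTF-8, unpacked as Latin-1: no error, but the text is garbled.
packed_utf8 = original.encode('utf-8')      # b'caf\xc3\xa9'
garbled = packed_utf8.decode('latin-1')
print(garbled == original)                  # False
print(len(original), len(garbled))          # 4 5

# Packed as Latin-1, unpacked as UTF-8: decoding fails outright.
packed_latin1 = original.encode('latin-1')  # b'caf\xe9'
try:
    packed_latin1.decode('utf-8')
    failed = False
except UnicodeDecodeError:
    failed = True
print(failed)                               # True
```

The silent case is usually the more dangerous one, because nothing signals that the text is wrong.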
Syntax and Examples
The basic syntax is:
text = byte_data.decode('utf-8')
If you already know the data uses UTF-8, this is the usual solution.
Example: decoding subprocess output
from subprocess import Popen, PIPE
stdout = Popen(['ls', '-l'], stdout=PIPE).communicate()[0]
text = stdout.decode('utf-8')
print(text)
This converts the bytes result into a normal Python string.
Example: simple bytes to string conversion
data = b'Hello, world!'
text = data.decode('utf-8')
print(text)
print(type(text))
Output:
Hello, world!
<class 'str'>
Better subprocess approach: get text directly
Instead of decoding later, you can ask subprocess to return text:
from subprocess import run, PIPE
result = run(['ls', '-l'], stdout=PIPE, text=True)
print(result.stdout)
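With text=True, subprocess decodes using the locale's default encoding. If you would rather fix the codec yourself, subprocess.run() also accepts an encoding argument, which switches the pipe to text mode as well. The sketch below assumes a child command built from sys.executable purely so that it runs on any platform; substitute the command you actually need.

```python
import sys
from subprocess import run, PIPE

# sys.executable is used only to keep the sketch portable;
# replace this with your real command, e.g. ['ls', '-l'].
command = [sys.executable, '-c', "print('hello from child')"]

# encoding= puts stdout in text mode and names the codec explicitly.
result = run(command, stdout=PIPE, encoding='utf-8', check=True)

print(type(result.stdout))
print(result.stdout.strip())
```

check=True is optional; it raises CalledProcessError if the command exits with a non-zero status.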
Step by Step Execution
Consider this example:
data = b'Python\n'
text = data.decode('utf-8')
print(text)
Here is what happens step by step:
- data = b'Python\n'
  - data is a bytes object.
  - It contains raw byte values representing the characters P, y, t, h, o, n, and a newline.
- text = data.decode('utf-8')
  - Python reads those bytes using the UTF-8 encoding.
  - It converts them into a Unicode string.
  - Now text is a str object.
- print(text)
  - print() displays the string as text.
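To make the steps above concrete, this short sketch inspects the same value before and after decoding. The extra variable names are only for illustration.

```python
data = b'Python\n'

# Iterating over bytes yields integers, one per byte.
byte_values = list(data)
print(byte_values)        # [80, 121, 116, 104, 111, 110, 10]

text = data.decode('utf-8')

# Iterating over str yields one-character strings.
characters = list(text)
print(characters)         # ['P', 'y', 't', 'h', 'o', 'n', '\n']

print(type(data).__name__, type(text).__name__)   # bytes str
```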
Real World Use Cases
Converting bytes to strings is common in many real programs.
Reading subprocess output
from subprocess import run, PIPE
result = run(['echo', 'hello'], stdout=PIPE)
text = result.stdout.decode('utf-8')
Used for:
- shell command wrappers
- automation scripts
- deployment tools
Reading HTTP responses
response_bytes = b'{"status": "ok"}'
text = response_bytes.decode('utf-8')
Used for:
- API clients
- web scraping
- integrations with external services
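A common next step after decoding a response body is parsing it. The sketch below assumes the body is JSON, which matches the sample bytes above; the body itself is a stand-in for data a real client would read from the network.

```python
import json

# Hypothetical response body; a real client would receive this over HTTP.
response_bytes = b'{"status": "ok"}'

text = response_bytes.decode('utf-8')
payload = json.loads(text)

print(payload['status'])   # ok
```

json.loads() can also take bytes directly, but decoding explicitly keeps the encoding decision visible.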
Reading binary files that actually contain text
with open('notes.txt', 'rb') as f:
data = f.read()
text = data.decode('utf-8')
Used when:
- working with files in binary mode
- parsing exported data
- processing logs
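If the file is known to contain text, opening it in text mode with an explicit encoding is often simpler than decoding by hand. The sketch below compares both routes using a temporary file created only for the example.

```python
import os
import tempfile

# Create a throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile('wb', suffix='.txt', delete=False) as tmp:
    tmp.write('first line\nsecond line\n'.encode('utf-8'))
    path = tmp.name

# Binary mode: read bytes, then decode manually.
with open(path, 'rb') as f:
    from_binary = f.read().decode('utf-8')

# Text mode with an explicit encoding: open() decodes for you.
with open(path, 'r', encoding='utf-8') as f:
    from_text = f.read()

os.remove(path)
print(from_binary == from_text)   # True
```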
Receiving socket data
packet = b'hello'  # hypothetical payload standing in for received data
message = packet.decode('utf-8')
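A fuller sketch of the socket case, assuming both ends agree on UTF-8. socket.socketpair() is used here only to get two connected sockets inside one process; a real program would have a separate client and server.

```python
import socket

# Two connected sockets in one process, standing in for client and server.
sender, receiver = socket.socketpair()
try:
    sender.sendall('ping'.encode('utf-8'))
    packet = receiver.recv(1024)        # bytes
    message = packet.decode('utf-8')    # str
finally:
    sender.close()
    receiver.close()

print(message)
```

On a real stream, a multi-byte character can be split across two recv() calls, so longer messages may need buffering or an incremental decoder such as the one returned by codecs.getincrementaldecoder().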
Real Codebase Usage
In real projects, developers usually try to convert data to text as early as possible if the data is meant to be text.
Common patterns
1. Decode immediately after reading bytes
output = process.communicate()[0].decode('utf-8')
This keeps later code simple because everything after that works with str.
2. Use text mode in subprocess
from subprocess import run, PIPE
result = run(['ls', '-l'], stdout=PIPE, text=True)
output = result.stdout
This is cleaner than manual decoding when you expect text.
3. Specify encodings explicitly
text = data.decode('utf-8')
Being explicit avoids environment-dependent behavior.
4. Handle decoding errors carefully
text = data.decode('utf-8', errors='replace')
Useful when input may contain unexpected bytes.
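The errors argument accepts several handlers besides 'replace'. The sketch below compares three of them on a made-up input that is not valid UTF-8; ascii() is used only so the results print unambiguously.

```python
# Hypothetical input: Latin-1 bytes that are not valid UTF-8.
data = b'caf\xe9'

replaced = data.decode('utf-8', errors='replace')
ignored = data.decode('utf-8', errors='ignore')
escaped = data.decode('utf-8', errors='backslashreplace')

print(ascii(replaced))   # 'caf\ufffd'
print(ascii(ignored))    # 'caf'
print(ascii(escaped))    # 'caf\\xe9'
```

'replace' and 'ignore' lose information, so they suit display and logging better than data you intend to write back.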
5. Validate before processing
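One possible reading of this pattern is to attempt a strict decode first and only process input that passes. The helper name decode_text below is hypothetical, and returning None for bad input is just one design choice among several.

```python
from typing import Optional

def decode_text(data: bytes, encoding: str = 'utf-8') -> Optional[str]:
    """Hypothetical helper: return decoded text, or None if decoding fails."""
    try:
        return data.decode(encoding)   # strict error handling by default
    except UnicodeDecodeError:
        return None

for candidate in (b'ok\n', b'\xff\xfe\x00'):
    text = decode_text(candidate)
    if text is None:
        print('skipping undecodable input:', candidate)
    else:
        print('processing:', text.strip())
```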
Common Mistakes
1. Calling str() on bytes
This is a very common mistake.
data = b'hello'
text = str(data)
print(text)
Output:
b'hello'
This does not decode the bytes. It creates a string representation of the bytes object.
Correct
text = data.decode('utf-8')
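For comparison, str() does decode when it is given an encoding as a second argument. This short sketch shows the two behaviors side by side.

```python
data = b'hello'

print(str(data))            # b'hello'  (the repr, not decoded text)
print(str(data, 'utf-8'))   # hello
print(str(data, 'utf-8') == data.decode('utf-8'))   # True
```

Most code still uses .decode() because it makes the intent obvious.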
2. Using the wrong encoding
data = 'café'.encode('utf-8')
text = data.decode('latin-1')
print(text)
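This runs without error but prints cafÃ© instead of café, because the two UTF-8 bytes that encode é are read as two separate Latin-1 characters.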
Fix
Decode using the same encoding that was used to create the bytes.
3. Mixing bytes and str
name = 'Alice'
data = b'Hello '
result = data + name
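The last line raises TypeError because Python will not concatenate bytes and str. A minimal sketch of the error and two ways to resolve it, depending on which type the rest of the code needs:

```python
name = 'Alice'
data = b'Hello '

try:
    result = data + name
except TypeError as exc:
    print('TypeError:', exc)

# Option 1: decode the bytes and work with str.
as_text = data.decode('utf-8') + name

# Option 2: encode the str and work with bytes.
as_bytes = data + name.encode('utf-8')

print(as_text)    # Hello Alice
print(as_bytes)   # b'Hello Alice'
```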
Comparisons
| Concept | Purpose | Result type | Typical use |
|---|---|---|---|
| bytes.decode('utf-8') | Convert bytes to text | str | Reading subprocess output, files, network data |
| str.encode('utf-8') | Convert text to bytes | bytes | Writing files, sending data over network |
| str(bytes_obj) | String representation of object | str | Debugging only, not real decoding |
| subprocess.run(..., text=True) | Return text directly | str | Capturing command output as text |
Cheat Sheet
Convert bytes to string
text = data.decode('utf-8')
Convert string to bytes
data = text.encode('utf-8')
Get subprocess output as text directly
from subprocess import run, PIPE
result = run(['ls', '-l'], stdout=PIPE, text=True)
print(result.stdout)
Safe decoding with invalid bytes
text = data.decode('utf-8', errors='replace')
Important rules
- bytes and str are different in Python 3
- Use .decode() to turn bytes into str
- Use .encode() to turn str into bytes
FAQ
How do I convert bytes to string in Python 3?
Use .decode():
text = data.decode('utf-8')
Why does Python 3 separate bytes and str?
Because raw binary data and text are not the same thing. This makes encoding issues more explicit and reduces hidden bugs.
Is str(my_bytes) the right way to decode bytes?
No. It returns a representation like "b'hello'", not the actual decoded text.
What encoding should I use when decoding bytes?
Usually UTF-8, unless you know the data uses a different encoding.
How do I get subprocess output as a string directly?
Use text=True with subprocess.run() or related functions:
result = run(['ls', '-l'], stdout=PIPE, text=True)
What if decoding raises UnicodeDecodeError?
The bytes may use a different encoding, or contain invalid data. Try the correct encoding or use:
data.decode('utf-8', errors='replace')
Mini Project
Description
Build a small Python utility that runs an external command, captures its output, converts it from bytes to str, and prints each line with line numbers. This demonstrates a very common workflow in automation scripts and command-line tools.
Goal
Create a script that executes a command, decodes its output as UTF-8 text, and processes the result as normal Python strings.
Requirements
- Run an external command using subprocess.
- Capture standard output.
- Convert the output from bytes to str.
- Split the text into lines.
- Print each non-empty line with a line number.
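One way the requirements could be met is sketched below. The function name numbered_lines and the placeholder command are assumptions for illustration; the command is built from sys.executable only so the script runs on any platform.

```python
import sys
from subprocess import run, PIPE

def numbered_lines(command):
    """Run command and return its non-empty output lines with numbers."""
    result = run(command, stdout=PIPE, check=True)   # stdout is bytes here
    text = result.stdout.decode('utf-8')             # bytes -> str
    lines = [line for line in text.splitlines() if line.strip()]
    return [f'{number}: {line}' for number, line in enumerate(lines, start=1)]

# Placeholder command chosen for portability; replace with e.g. ['ls', '-l'].
command = [sys.executable, '-c', "print('alpha'); print(); print('beta')"]

for line in numbered_lines(command):
    print(line)
```

Numbering is applied after empty lines are filtered out, so the printed numbers are consecutive.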