keywords: Python, IO Stream Notes

How to read and write unicode (UTF-8) files

source:

import io
with io.open(filename,'r',encoding='utf8') as f:
    text = f.read()
# process Unicode text
with io.open(filename,'w',encoding='utf8') as f:
    f.write(text)

Origin:
https://www.tutorialspoint.com/How-to-read-and-write-unicode-UTF-8-files-in-Python

How to writelines with newline
with open(dst_filename, 'w') as f:
    f.writelines(f'{s}\n' for s in lines)

Origin:
https://stackoverflow.com/a/67061327/1645289

How to check whether a file contains valid UTF-8
import io

def try_utf8(data):
    """Return the decoded text on success, or None on failure."""
    try:
        return data.decode('utf-8')
    except UnicodeDecodeError:
        return None

def test():
    # read raw bytes; decoding is attempted separately
    with io.open(r"D:\test.txt", 'rb') as f:
        data = f.read()
    udata = try_utf8(data)
    if udata is None:
        print('not utf8')
    else:
        print('is utf8')
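
For large files, reading everything into memory just to validate it can be wasteful. Below is a minimal sketch of a chunked check using the standard codecs incremental decoder. The function name is_valid_utf8 and the default chunk size are illustrative choices, not part of the original snippet.

```python
import codecs

def is_valid_utf8(path, chunk_size=8192):
    """Return True if the file at `path` decodes cleanly as UTF-8."""
    # The incremental decoder buffers multi-byte sequences that are split
    # across chunk boundaries, so chunking does not cause false errors.
    decoder = codecs.getincrementaldecoder('utf-8')()
    try:
        with open(path, 'rb') as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                decoder.decode(chunk)
        # final=True raises if the file ends in the middle of a character
        decoder.decode(b'', final=True)
    except UnicodeDecodeError:
        return False
    return True
```

The function only reports validity; it does not return the decoded text.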

How to convert a dictionary to a string

source:

str(my_dict)  # my_dict is the dictionary to convert

Convert a Python dictionary to a string and back

source:

import json

record_id = 42  # example value

# convert to string
json_str = json.dumps({'id': record_id})

# load to dict
my_dict = json.loads(json_str)

https://stackoverflow.com/questions/4547274/convert-a-python-dict-to-a-string-and-back
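
Note that str() on a dictionary produces Python literal syntax, not JSON, so the two text forms need different parsers. A small sketch of both round trips, using the standard ast and json modules; the sample data in record is made up for illustration.

```python
import ast
import json

record = {'id': 42, 'name': 'example', 'active': True}  # illustrative data

as_repr = str(record)         # Python literal syntax: single quotes, True
as_json = json.dumps(record)  # JSON syntax: double quotes, true

# Each text form is read back with its matching parser.
from_repr = ast.literal_eval(as_repr)
from_json = json.loads(as_json)

print(from_repr == record, from_json == record)
```

ast.literal_eval only accepts Python literals, so it is a safer choice than eval() for reading back the str() form.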

List files in particular directory

e.g.:

import os
arr = os.listdir('D:/test/')
print(arr)

Output:

['$RECYCLE.BIN', 'work.txt', '3ebooks.txt', 'documents']

How do I list all files of a directory?
https://stackoverflow.com/questions/3207219/how-do-i-list-all-files-of-a-directory
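
os.listdir returns files and subdirectories together. If only regular files are wanted, one option is os.scandir, sketched below; the helper name list_files is hypothetical.

```python
import os

def list_files(directory):
    """Return the names of regular files directly inside `directory`."""
    # scandir entries cache the file type, so is_file() is cheap here
    with os.scandir(directory) as entries:
        return sorted(entry.name for entry in entries if entry.is_file())
```

The search is not recursive: files inside subdirectories are not included.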

Create a directory

e.g.

import os
# exist_ok=True (Python 3.2+) avoids the race between checking and creating
os.makedirs(directory, exist_ok=True)

How can I safely create a nested directory?
https://stackoverflow.com/questions/273192/how-can-i-safely-create-a-nested-directory
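
The same operation can be written with pathlib. A minimal sketch, assuming Python 3.5 or later; the helper name ensure_dir is hypothetical.

```python
from pathlib import Path

def ensure_dir(path):
    """Create `path` and any missing parents; do nothing if it exists."""
    target = Path(path)
    # parents=True creates intermediate directories,
    # exist_ok=True suppresses the error for an existing directory
    target.mkdir(parents=True, exist_ok=True)
    return target
```

If the path already exists but is a regular file, mkdir still raises FileExistsError.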

Remove a directory

To delete a file:

import os
os.remove("path_to_file")

os.remove cannot delete a directory. To remove a directory, use os.rmdir:

import os
os.rmdir("path_to_dir")

os.rmdir only deletes a directory that is empty. If the directory is not empty, use the shutil module:

import shutil
shutil.rmtree("path_to_dir")

All of the methods above are plain Python and do not depend on the operating system. If you know which operating system the code runs on, you can also call an OS-specific command instead, for example on Linux or macOS:

import os
os.system("rm -rf path_to_dir")

How to remove a directory including all its files in python?
https://stackoverflow.com/questions/43756284/how-to-remove-a-directory-including-all-its-files-in-python

Example 2:

import shutil

shutil.rmtree('/folder_name', ignore_errors=True)

How do I remove/delete a folder that is not empty?
https://stackoverflow.com/questions/303200/how-do-i-remove-delete-a-folder-that-is-not-empty
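
The calls above can be combined into one helper that picks the right function for the path it is given. This is only a sketch: the name remove_path is hypothetical, and symbolic links are not treated specially.

```python
import os
import shutil

def remove_path(path):
    """Delete a file, or a directory together with all of its contents."""
    if os.path.isdir(path):
        shutil.rmtree(path)   # works for empty and non-empty directories
    elif os.path.exists(path):
        os.remove(path)       # regular file
    # a path that does not exist is silently ignored
```

Unlike ignore_errors=True, this still raises on real failures such as missing permissions.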

List files in directory recursively

Recursive sub folder search and return files in a list (since Python 3.5):

import glob

# my_path/     the dir
# **/       every file and dir under my_path
# *.txt     every file that ends with '.txt'
files = glob.glob(my_path + '/**/*.txt', recursive=True)

# iglob yields the matches one at a time instead of building a list
for file in glob.iglob(my_path + '/**/*.txt', recursive=True):
    print(file)

Reference:
https://stackoverflow.com/a/40755802/1645289
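
pathlib offers the same recursive search through Path.rglob. A short sketch; the helper name find_txt_files is hypothetical.

```python
from pathlib import Path

def find_txt_files(root):
    """Return every *.txt file under `root`, searching subfolders too."""
    # rglob('*.txt') is equivalent to glob('**/*.txt')
    return sorted(p for p in Path(root).rglob('*.txt') if p.is_file())
```

The result is a list of Path objects; call str() on an item if a plain string is needed.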

Check if path is file or directory
import os

fpath = 'D:/workspace/python/'
isFile = os.path.isfile(fpath)       # True only for an existing regular file
isDirectory = os.path.isdir(fpath)   # True only for an existing directory

Combine file path string
import os

os.path.join('C:\\abc', 'def.jpg')

Get file size
>>> import os
>>> os.path.getsize("/path/to/file.mp3")
2071611

or:

os.stat(filename).st_size

Origin:
https://stackoverflow.com/a/2104083/1645289
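
Both calls return the size in bytes. For display, the number can be converted to a larger unit; the sketch below uses binary units (1 KiB = 1024 bytes). The function name format_size and the one-decimal format are arbitrary choices.

```python
def format_size(num_bytes):
    """Format a byte count using binary units, e.g. 1536 -> '1.5 KiB'."""
    for unit in ('B', 'KiB', 'MiB', 'GiB'):
        if num_bytes < 1024:
            return f'{num_bytes:.1f} {unit}'
        num_bytes /= 1024
    return f'{num_bytes:.1f} TiB'

print(format_size(2071611))  # 2.0 MiB
```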

List files and directories in a specific directory
import os

drc = 'D:/workspace/python_dev'

# Print the path of every directory under drc (including drc itself)
for dirpath, dirnames, filenames in os.walk(drc):
    print(dirpath)

print('++++++++++++++++')

# Print the name of every subdirectory
for dirpath, dirnames, filenames in os.walk(drc):
    for dname in dirnames:
        print(dname)

print('++++++++++++++++')

# Print the name of every file
for dirpath, dirnames, filenames in os.walk(drc):
    for fname in filenames:
        print(fname)

Output:

D:/workspace/python_dev
D:/workspace/python_dev\a
D:/workspace/python_dev\a\b
D:/workspace/python_dev\a\c
++++++++++++++++
a
b
c
++++++++++++++++
batch_replace_string.py
000.txt
666.txt
111.txt
222.txt
333.txt
444.txt

How to read all lines without newline from text file
with open(filename, 'r', encoding='utf8') as fileobj:
    for line in fileobj:
        print(line.rstrip('\n'))

Replace multiple words in a file
checkWords = ("old_text1","old_text2","old_text3","old_text4")
repWords = ("new_text1","new_text2","new_text3","new_text4")

# 'input.txt' and 'output.txt' are placeholder file names
with open('input.txt', 'r', encoding='utf8') as f1, \
     open('output.txt', 'w', encoding='utf8') as f2:
    for line in f1:
        for check, rep in zip(checkWords, repWords):
            line = line.replace(check, rep)
        f2.write(line)

Reference:
find and replace multiple words in a file python
https://stackoverflow.com/a/51240945/1645289
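
Chained str.replace calls run one after another, so text produced by an earlier replacement can be replaced again by a later one. Where that matters, a single pass with a regular expression avoids it. A sketch under that assumption; the function name replace_words is hypothetical.

```python
import re

def replace_words(text, mapping):
    """Replace every key of `mapping` found in `text` with its value."""
    if not mapping:
        return text
    # Longest keys first, so 'abc' wins over 'ab' when both could match.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(k) for k in keys))
    # Each match is looked up once; replaced text is never rescanned.
    return pattern.sub(lambda m: mapping[m.group(0)], text)

print(replace_words('cat dog', {'cat': 'dog', 'dog': 'cat'}))  # dog cat
```

With chained replace calls the same input would end up as 'cat cat', because the first replacement's output is matched by the second.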

Replace a string in multiple files within a directory
import os
import glob
import io

root_dir = 'D:/test'  # renamed from dir, which shadows the built-in

# keys are the strings to find, values are their replacements
dic = {"src": "dest", "aaa": "bbb"}

def replace_all(text, dic):
    for i, j in dic.items():
        text = text.replace(i, j)
    return text

for file in glob.iglob(root_dir + '/**/*.txt', recursive=True):
    print(file)
    with io.open(file, 'r', encoding='utf8') as f:
        text = f.read()

    new_text = replace_all(text, dic)

    with io.open(file, 'w', encoding='utf8') as f:
        f.write(new_text)

Encode string using base64
>>> import base64
>>> data = base64.b64encode(b'data to be encoded')
>>> print(data)
b'ZGF0YSB0byBiZSBlbmNvZGVk'

Or

>>> string = 'data to be encoded'
>>> data = base64.b64encode(string.encode())
>>> print(data)
b'ZGF0YSB0byBiZSBlbmNvZGVk'

Reference:
https://stackoverflow.com/a/41437531/1645289
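
b64encode returns bytes, not str. The sketch below shows the full round trip, including how to get a plain string and how to decode it again; the variable names are illustrative.

```python
import base64

text = 'data to be encoded'

encoded = base64.b64encode(text.encode('utf-8'))  # bytes
encoded_str = encoded.decode('ascii')             # str, safe to print or store

# b64decode accepts the ASCII string and returns the original bytes
decoded = base64.b64decode(encoded_str).decode('utf-8')

print(encoded_str)
print(decoded)
```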

Convert binary to string using Python

Example:

# Python3 code to demonstrate working of  
# Converting binary to string 
# Using BinarytoDecimal(binary)+chr() 

# Defining BinaryToDecimal() function
def BinaryToDecimal(binary):
    # binary is an int whose decimal digits are the bits, e.g. 1000111
    decimal, i = 0, 0
    while binary != 0:
        dec = binary % 10
        decimal = decimal + dec * pow(2, i)
        binary = binary // 10
        i += 1
    return decimal
  
# Driver's code  
# initializing binary data 
bin_data ='10001111100101110010111010111110011'
   
# print binary data 
print("The binary value is:", bin_data) 
   
# initializing an empty string for
# storing the string data
str_data = ''

# slicing the input and converting it
# to decimal and then converting it to a string
for i in range(0, len(bin_data), 7):

    # slicing bin_data into 7-bit chunks [i, i + 6]
    # and storing each chunk as an integer in temp_data
    temp_data = int(bin_data[i:i + 7])

    # passing temp_data to the BinaryToDecimal() function
    # to get the decimal value of the corresponding temp_data
    decimal_data = BinaryToDecimal(temp_data)

    # Decoding the decimal value returned by the
    # BinaryToDecimal() function using chr(), which
    # returns the character for the given ASCII value,
    # and appending it to str_data
    str_data = str_data + chr(decimal_data)

# printing the result
print("The Binary value after string conversion is:",
      str_data)

Output:

The binary value is: 10001111100101110010111010111110011
The Binary value after string conversion is: Geeks

https://www.geeksforgeeks.org/convert-binary-to-string-using-python/
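
The manual conversion can be shortened with the built-in int(), which accepts a base argument. A sketch assuming the same 7-bit ASCII layout as the example; the function name binary_to_text is hypothetical.

```python
def binary_to_text(bits, width=7):
    """Decode a string of 0s and 1s, `width` bits per character."""
    if len(bits) % width:
        raise ValueError('bit string length must be a multiple of width')
    # int(chunk, 2) parses the chunk as a base-2 number
    return ''.join(chr(int(bits[i:i + width], 2))
                   for i in range(0, len(bits), width))

print(binary_to_text('10001111100101110010111010111110011'))  # Geeks
```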

Current working directory & The directory of the script being run

For the directory of the script being run:

import pathlib
pathlib.Path(__file__).parent.absolute()

For the current working directory (the directory the command was run from):

import pathlib
pathlib.Path().absolute()

or:

import os
os.getcwd()

https://stackoverflow.com/questions/3430372/how-do-i-get-the-full-path-of-the-current-files-directory

Generating an MD5 checksum of a file

Python 3.8+:

import hashlib
with open("your_filename.txt", "rb") as f:
    file_hash = hashlib.md5()
    while chunk := f.read(8192):
        file_hash.update(chunk)

print(file_hash.digest())
print(file_hash.hexdigest())  # to get a printable str instead of bytes

Origin:
https://stackoverflow.com/a/59056796/1645289
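
For Python versions before 3.8, the same chunked read can be written without the := operator. The sketch below also takes the algorithm name as a parameter; the function name file_checksum and its defaults are illustrative. MD5 is fine for detecting accidental corruption but is not collision-resistant, so 'sha256' is the usual choice when security matters.

```python
import hashlib

def file_checksum(path, algorithm='md5', chunk_size=8192):
    """Return the hex digest of the file at `path`."""
    digest = hashlib.new(algorithm)
    with open(path, 'rb') as f:
        # iter() with a sentinel calls f.read until it returns b''
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()
```

Usage: file_checksum("your_filename.txt") for MD5, or file_checksum("your_filename.txt", "sha256") for SHA-256.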

