vak: (Default)
[personal profile] vak
(I found this article interesting and decided to repost it: 10 Python Scripts to Automate Everyday Problems)

"Устали выполнять повторяющиеся задачи каждый день? Тогда зачем делать это вручную, если вы можете автоматизировать их с помощью вашего любимого языка программирования. В этой статье я представляю вам 10 скриптов Python для автоматизации ваших повседневных проблем и задач."

Fetch IMDB

You probably use IMDb to pick the best movie for your weekend, but did you know you can scrape IMDb data with Python? This automation script lets you do IMDb data scraping the Pythonic way. Below is the standard code you can use.
  • Use it in your IMDb project
  • Scrape and analyze movie data
  • Find the best movie for your weekend
  • And much more
# IMDB
# pip install imdbpy

import imdb
ia = imdb.IMDb()

# Search for a movie.
search_result = ia.search_movie('The Matrix')

# Get the ID of the movie.
movie_id = search_result[0].movieID

# Get the movie from ID
movie = ia.get_movie(movie_id)

# Get Rating of movie
rating = movie['rating']

# Get Plot of movie
plot = movie['plot']

# Get Genre of movie
genre = movie['genres']

# Get Box office of movie
box_office = movie['box office']

# Get Cast of movie
cast = movie['cast']

# Get Directors of movie
directors = movie['directors']

# Get Writers of movie
writers = movie['writers']

# Search for a person.
search_result = ia.search_person('Keanu Reeves')
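
As a quick sanity check, here is a minimal usage sketch of the fields fetched above (the printed fields and the slice of the cast are just examples):

# Print a few of the fetched fields (minimal usage sketch)
print(movie['title'], rating)
print(genre)
print([actor['name'] for actor in cast[:5]])

# Inspect the first match for the person search
person = search_result[0]
print(person.personID, person['name'])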

Email Fetcher

You have seen an email-sending script in my previous articles, but you can fetch email too, not just send it. This automation script will be your handy tool for fetching emails from Gmail, Outlook, or any other mail server. Check the code below.
  • Extract emails for a project
  • Extract emails from your inbox
  • Much more
# Fetch Emails
# pip install imap-tools

from imap_tools import MailBox

def Fetch_Email(user, pwd):
    mbox = MailBox('imap.mail.com').login(user, pwd, "INBOX")
    for email in mbox.fetch():
        print(email.date, email.subject, len(email.text or email.html))

Fetch_Email("user", "pass")

Analyze Stock Market

Analyze the stock market the Pythonic way with this automation script. It uses the yfinance module, which programmatically fetches stock market information and data for you. You can select multiple stocks, analyze the data, make charts and graphs, and much more.
  • Get data for multiple stocks
  • Track the market daily
  • Script for your project
  • Script for creating a market chart
  • Much more
# Analyse Stock market
# pip install yfinance

import yfinance as yf
market = yf.Ticker("MSFT")

# Get stockmarket info
info = market.info
print(info)

# Fetch historical data
historical = market.history(period="1y")
print(historical)

# get actions
actions = market.actions
print(actions)

# get dividends
dividends = market.dividends
print(dividends)

# get splits
splits = market.splits
print(splits)

# get balance sheet
balance_sheet = market.balance_sheet
print(balance_sheet)

# get market news
market_news = market.news
print(market_news)

# show earnings
earnings = market.earnings
print(earnings)

# get analyst recommendations
rec = market.recommendations
print(rec)

# Get another Ticker
market1 = yf.Ticker("AAPL")
market2 = yf.Ticker("TSLA")
market3 = yf.Ticker("GOOG")

# Fetch market data for multiple tickers at once
market_data = yf.download("AAPL TSLA GOOG", start="2019-01-01", end="2019-12-31")
print(market_data)
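
For the "market chart" item from the list above, here is a minimal sketch, assuming matplotlib is installed, that plots the closing prices from the history fetched earlier:

# Plot closing prices (sketch; assumes matplotlib is installed)
# pip install matplotlib
import matplotlib.pyplot as plt

historical['Close'].plot(title="MSFT close, last year")
plt.show()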

PDF Watermark Remover

Need to remove a watermark from your PDF but don't know how? Here is an automation script that uses the PyPDF4 module to remove text watermarks from your PDF files.

You can use the script to remove watermarks from multiple PDF files while keeping the quality the same.
# PDF Watermark remover
# pip install PyPDF4

from PyPDF4 import PdfFileReader, PdfFileWriter
from PyPDF4.utils import b_ as io
from PyPDF4.pdf import ContentStream
from PyPDF4.generic import TextStringObject, NameObject

def Watermark_Remover(target_text, pdf_file):
    with open(pdf_file, "rb") as f:
        pdf = PdfFileReader(f, "rb")
        out = PdfFileWriter()
        pages = pdf.getNumPages()
        for p in range(pages):
            page = pdf.getPage(p)
            content = page["/Contents"].getObject()
            content2 = ContentStream(content, pdf)
            # Blank out every text-showing (Tj) operation that starts with the watermark text
            for operands, operator in content2.operations:
                if operator == io("Tj"):
                    txt = operands[0]
                    if isinstance(txt, str) and txt.startswith(target_text):
                        operands[0] = TextStringObject('')

            page.__setitem__(NameObject('/Contents'), content2)
            out.addPage(page)

        with open("out.pdf", "wb") as outStream:
            out.write(outStream)

target_text = 'Sample'
Watermark_Remover(target_text, "test.pdf")

Image Size Compressor

A script for making your images and photos smaller while keeping the quality the same. This automation script uses the pyguetzli module, which compresses your photos to reduce their size.

This handy script can be used for many purposes.
  • Compress photos for a project
  • Bulk photo compressor
  • Compression as part of your app
# Compress Photo Size
# pip install pyguetzli

import pyguetzli

def Compress(image):
    img = open(image, "rb").read()
    optimized = pyguetzli.process_jpeg_bytes(img, quality=80)
    with open("optimized.jpg", "wb") as output:
        output.write(optimized)

Compress("test.jpg")
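
For the "bulk photo compressor" use case, a minimal sketch that compresses every JPEG in a folder (photos/ is just an example path):

# Bulk-compress every JPEG in a folder (sketch; photos/ is an example path)
from pathlib import Path

for jpg in Path("photos").glob("*.jpg"):
    data = jpg.read_bytes()
    optimized = pyguetzli.process_jpeg_bytes(data, quality=80)
    jpg.with_name(jpg.stem + "_optimized.jpg").write_bytes(optimized)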

PDF Extracting

Extract text, images, and tables from your PDF using this automation script, which relies on three different modules. Below you can find the script, which you are free to use.
  • Bulk PDF extraction
  • Extracting tables from a PDF
  • PDF extraction for a project
  • Much more
# PDF Extracting with Python
# pip install textract
# pip install tabula-py
# pip install PyMuPDF
# pip install openpyxl   (needed below to write the Excel file via pandas)

import textract as extract
import tabula as tb
import fitz
import pandas

def Extract_Text(pdf):
    text = extract.process(pdf)
    print("Text: ", text)

def Extract_Photos(pdf):
    doc = fitz.open(pdf)
    i = 1
    for page in doc:
        for img in page.getImageList():
            xref = img[0]
            pix = fitz.Pixmap(doc, xref)
            pix.writePNG(f'test_{i}.png')
            print("Image: ", pix)
            i += 1

def Extract_Tables(pdf):
    tables = tb.read_pdf(pdf, pages='all', multiple_tables=True)
    # save as csv
    tb.convert_into(pdf, 'test.csv', output_format='csv', pages='all')
    # save as excel: convert_into has no xlsx format, so write the tables with pandas
    with pandas.ExcelWriter('test.xlsx') as writer:
        for n, table in enumerate(tables):
            table.to_excel(writer, sheet_name=f'table_{n}')
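
A minimal usage sketch that runs all three extractors on one file (test.pdf is just an example name):

# Run all three extractors on one file (test.pdf is an example name)
Extract_Text('test.pdf')
Extract_Photos('test.pdf')
Extract_Tables('test.pdf')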

PySimpleGui

Create an eye-catching, beautiful GUI with this script, which uses the PySimpleGUI module. The module is simple to use yet powerful enough to build an app for almost anything in Python.
  • Creating GUI apps for your project
  • Creating an app for graphs
  • Creating an app for machine learning
  • Much more
#!/usr/bin/env python3
# pip install PySimpleGUI
# brew install python-tk

import PySimpleGUI as gui

layout = []

# Label Text
text = gui.Text('This is PysimpleGui', size=(30, 1))
layout.append([text])

# Button
button = gui.Button('Click Me')
layout.append([button])

# Input Box
input_box = gui.Input(key='-IN-')
layout.append([input_box])

# Browse Folders
browse_folder = gui.FolderBrowse()
layout.append([browse_folder])

# Set Image
image = gui.Image('img.png')
layout.append([image])

# Radio Buttons
radio = [
    gui.Radio('Radio A', 1),
    gui.Radio('Radio B', 1),
    gui.Radio('Radio C', 1),
]
layout.append(radio)

# Check Boxes
check = gui.Checkbox('Check', default=True)
layout.append([check])

# Set window
win = gui.Window('Window Title', layout, auto_size_text=True)

while True:
    event, values = win.read()
    #gui.Print(event, values)

    if event == gui.WIN_CLOSED or event == 'Exit':
        break

win.close()
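
The loop above only waits for the window to close. For completeness, a small sketch (element keys and the 'Greet' button are made up for this example; it reuses the gui import from above) showing how a button event and an input value are handled:

# Minimal event-handling sketch (keys and button name are example values)
layout2 = [
    [gui.Input(key='-NAME-')],
    [gui.Button('Greet')],
    [gui.Text('', size=(30, 1), key='-OUT-')],
]
win2 = gui.Window('Events', layout2)

while True:
    event, values = win2.read()
    if event == gui.WIN_CLOSED:
        break
    if event == 'Greet':
        win2['-OUT-'].update('Hello, ' + values['-NAME-'])

win2.close()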

Merge CSV Files

This simple automation script lets you merge multiple CSV files into one file. It also drops duplicate rows while merging.
# Merge CSV Files
# pip install pandas

from pandas import read_csv
import pandas
import os

def Merge_Csv(files):
    df = pandas.concat(map(read_csv, files), ignore_index=True)
    df = df.drop_duplicates()  # drop duplicate rows while merging
    df.to_csv("merged.csv", index=False)
    print("CSV has Been Merged...")

Merge_Csv(["movies.csv", "movies2.csv", "movies3.csv"])

Automate Databases

Databases are organized collections of your data, and we rely on them every day. You can query and manage a database from Python too. This script uses the mysql-connector-python module to connect to your database and lets you run any SQL query, fetch results, and more.
  • Use the script in your project
  • Script for querying a database
  • Script for updating a database
# Database with Python
# pip install mysql-connector-python

import mysql.connector

# Connect to your SQL database
sql = mysql.connector.connect(
    host="Your host",
    user="username",
    passwd="",
    database="mydatabase_1"
)

# create table
cursor = sql.cursor()
cursor.execute("CREATE TABLE movies (title VARCHAR(255), rating VARCHAR(255))")

# insert data
query = "INSERT INTO movies (title, rating) VALUES (%s, %s)"
value = ("The Matrix", "7.5")
cursor.execute(query, value)
sql.commit()  # commit so the insert is actually saved (autocommit is off by default)

# Select Data
cursor.execute("SELECT * FROM movies")
myresult = cursor.fetchall()
for x in myresult:
print(x)

# Delete Data
cursor.execute("DELETE FROM movies WHERE title = 'The Matrix'")
sql.commit()

# Get Specific Data
cursor.execute("SELECT * FROM movies WHERE title = 'The Matrix'")
myresult = cursor.fetchall()

# Update Data
cursor.execute("UPDATE movies SET rating = '8' WHERE title = 'The Matrix'")
sql.commit()

# Delete Table
cursor.execute("DROP TABLE movies")

# Close Connection
sql.close()

Reddit Bot

Reddit is an awesome social media platform, and you can extract data from it and build bots for it in Python too. This script is a simple walkthrough of building a capable Reddit bot with the PRAW module.
  • Create a Reddit bot for a project
  • Fetch Reddit data
  • A bot that tracks a subreddit
  • Much more
# Reddit Bot
# pip install praw

import praw

reddit = praw.Reddit(client_id='',
                     client_secret='',
                     user_agent='',
                     username='',
                     password='')

# Get the SubReddit
subreddit = reddit.subreddit('python')

# Get the top 10 hot posts
for sub in subreddit.hot(limit=10):
    print(sub.title)

# Get info about a single post
sub = next(subreddit.hot(limit=1))
print(sub.title)
print(sub.author)
print(sub.id)
print(sub.ups)
print(sub.downs)
print(sub.visited)

# Get the comments of the post
for comment in sub.comments:
    print(comment.body)

# Get the permalink of each comment
for comment in sub.comments:
    print(comment.permalink)

# Get the replies of each comment
for comment in sub.comments:
    for reply in comment.replies:
        print(reply.body)

# Get the score of a post
for sub in subreddit.hot(limit=1):
    print(sub.score)
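
For the "bot that tracks a subreddit" item from the list above, a minimal sketch that reuses the reddit instance and prints new posts as they arrive (valid API credentials are assumed):

# Watch a subreddit for new submissions (sketch; reuses the reddit instance above)
for submission in reddit.subreddit('python').stream.submissions(skip_existing=True):
    print(submission.created_utc, submission.title)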

Date: 2022-07-29 21:54 (UTC)
euthanasepam: Delirium Tremens (Delirium_Tremens)
From: [personal profile] euthanasepam
> pip install imdbpy


OMG!

Date: 2022-07-29 22:04 (UTC)
juan_gandhi: (Default)
From: [personal profile] juan_gandhi

An amazing collection. Thanks!

Date: 2022-07-29 22:49 (UTC)
spamsink: (Default)
From: [personal profile] spamsink
"There is an app a pip for that".

I wonder, are watermarks really done the same way in all PDFs?


Date: 2022-07-30 00:53 (UTC)
euthanasepam: La-la-la-la! La-la-la-la! (Default)
From: [personal profile] euthanasepam
Guido didn't invent Python so we could check silly things like that!

Date: 2022-07-30 01:03 (UTC)
euthanasepam: La-la-la-la! La-la-la-la! (Default)
From: [personal profile] euthanasepam
Why such skepticism about these pip-driven Python programs... I once struggled (I wrote about it somewhere in comments about a year ago in [personal profile] rampitec's journal) with a ready-made xmlformatter for formatting XML files, namely books in the FB2 format. It did everything I needed, exactly the way I needed. But it was terribly slow and hammered the CPU. Then I tried xmllint for the same job. And the world lit up in new colors. I threw together a cmd script that checks everything that needs checking and processes the files in the directory it was called from. Purely for experiments I keep the 4 volumes of "War and Peace" from Flibusta, which take up a little under 6 MB on disk. Well, xmllint, called in a loop from that script, processes those four volumes in a fraction of a second.



P. S.

They need to be formatted so that they can then be run through tools that fix broken typography.

Edited Date: 2022-07-30 01:09 (UTC)