Some time ago I needed to add `height` and `width` attributes to all images on this blog, so I wrote a Python script. Here is the main idea:
- Iterate over all blog posts.
- Find all `img` tags in each post.
- Extract the content of the `src` attribute.
- Open the image and extract its size.
- Write the image size to the `img` attributes `width` and `height`.
To parse the data, I used BeautifulSoup:
#!/bin/python
from BeautifulSoup import BeautifulSoup
from PIL import Image
import glob

# Path where the posts are, in markdown format
path = "/ruta/ficheros/*.md"

# Iterate over all posts
for fname in glob.glob(path):
    # Open the post
    f = open(fname)
    # Create a BeautifulSoup object to parse the file
    soup = BeautifulSoup(f)
    f.close()

    # For each img tag:
    for img in soup.findAll('img'):
        try:
            # Only handle images hosted on the blog itself
            if img['src'].startswith("/assets"):
                # Open the image
                pil = Image.open("/ruta/carpeta/imagenes" + img['src'])
                # Get its size
                width, height = pil.size
                # Modify img tag with image size; the HTML width and
                # height attributes take pixel values without a unit
                img['width'] = str(width)
                img['height'] = str(height)
        except KeyError:
            # The img tag has no src attribute, skip it
            pass

    # Save the updated post
    with open(fname, "wb") as out:
        out.write(str(soup))
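The script above uses the BeautifulSoup 3 import (`from BeautifulSoup import BeautifulSoup`) and writes a `str` to a file opened in binary mode, so it runs on Python 2 only. For anyone on Python 3, here is a minimal sketch of the same idea. It assumes the `beautifulsoup4` and Pillow packages, and the names `read_size`, `add_image_sizes`, `update_posts` and their parameters are made up for this example; they are not part of the original script.

```python
from pathlib import Path

from bs4 import BeautifulSoup  # assumption: the beautifulsoup4 package


def read_size(image_path):
    """Return (width, height) of an image file using Pillow."""
    from PIL import Image  # imported lazily so only this helper needs Pillow

    with Image.open(image_path) as image:
        return image.size


def add_image_sizes(markup, image_root, prefix="/assets", size_of=read_size):
    """Return markup with width/height set on every local img tag."""
    soup = BeautifulSoup(markup, "html.parser")
    for img in soup.find_all("img"):
        src = img.get("src")
        if not src or not src.startswith(prefix):
            continue  # skip external images and tags without a src
        # Map the site-relative src onto the image folder on disk
        width, height = size_of(Path(image_root) / src.lstrip("/"))
        img["width"] = str(width)
        img["height"] = str(height)
    return str(soup)


def update_posts(posts_dir, image_root):
    """Rewrite every *.md post in posts_dir whose markup changed."""
    for post in Path(posts_dir).glob("*.md"):
        original = post.read_text(encoding="utf-8")
        updated = add_image_sizes(original, image_root)
        if updated != original:
            post.write_text(updated, encoding="utf-8")
```

Calling `update_posts("/ruta/ficheros", "/ruta/carpeta/imagenes")` would process the same folders as the original script. Since the parser re-serializes the whole post, it may normalize some markup along the way, so it is worth checking the diff before committing the result.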
I hope you find it useful. The script is also available on GitHub.