Sunday, September 24, 2017

Python zip multiple directories into one zip file

Leave a Comment

I have a top directory ds237 which has multiple sub-directories under it as below:

ds237/ ├── dataset_description.json ├── derivatives ├── sub-01 ├── sub-02 ├── sub-03 ├── sub-04 ├── sub-05 ├── sub-06 ├── sub-07 ├── sub-08 ├── sub-09 ├── sub-10 ├── sub-11 ├── sub-12 ├── sub-13 ├── sub-21 ├── sub-22 ├── sub-23 ├── sub-24 ├── sub-25 ├── sub-26 ├── sub-27 ├── sub-28 ├── sub-29

I am trying to create multiple zip files(with proper zip names) from ds237 as per size of the zip files. sub01-01.zip: contain sub-01 to sub-07 sub08-13.zip : it contains sub08 to sub-13

I have written a logic which creates a list of sub-directories [sub-01,sub-02, sub-03, sub-04, sub-05]. I have created the list so that the total size of the all sub directories in the list should not be > 5gb.

My question: is how can i write a function to zip these sub-dirs (which are in a list) into a destination zip file with proper name. Basically i want to write a function as follows:

def zipit([list of subdirs], 'path/to/zipfile/sub*-*.zip'):

I linux i generally achieve this by: 'zip -r compress/sub01-08.zip ds237/sub-0[1-8]'

3 Answers

Answers 1

Looking at https://stackoverflow.com/a/1855118/375530, you can re-use that answer's function to add a directory to a ZipFile.

import os import zipfile   def zipdir(path, ziph):     # ziph is zipfile handle     for root, dirs, files in os.walk(path):         for file in files:             ziph.write(os.path.join(root, file),                        os.path.relpath(os.path.join(root, file),                                        os.path.join(path, '..')))   def zipit(dir_list, zip_name):     zipf = zipfile.ZipFile(zip_name, 'w', zipfile.ZIP_DEFLATED)     for dir in dir_list:         zipdir(dir, zipf)     zipf.close() 

The zipit function should be called with your pre-chunked list and a given name. You can use string formatting if you want to use a programmatic name (e.g. "path/to/zipfile/sub{}-{}.zip".format(start, end)).

Answers 2

You can use subprocess calling 'zip' and passing the paths as arguments

Answers 3

The following will give you zip file with a first folder ds100:

import os import zipfile      def zipit(folders, zip_filename):     zip_file = zipfile.ZipFile(zip_filename, 'w', zipfile.ZIP_DEFLATED)      for folder in folders:         for dirpath, dirnames, filenames in os.walk(folder):             for filename in filenames:                 zip_file.write(                     os.path.join(dirpath, filename),                     os.path.relpath(os.path.join(dirpath, filename), os.path.join(folders[0], '../..')))      zip_file.close()   folders = [     "/Users/aba/ds100/sub-01",     "/Users/aba/ds100/sub-02",     "/Users/aba/ds100/sub-03",     "/Users/aba/ds100/sub-04",     "/Users/aba/ds100/sub-05"]  zipit(folders, "/Users/aba/ds100/sub01-05.zip") 

For example sub01-05.zip would have a structure similar to:

ds100 ├── sub-01 |   ├── 1 |       ├── 2 |   ├── 1 |   ├── 2 ├── sub-02     ├── 1         ├── 2     ├── 1     ├── 2 
If You Enjoyed This, Take 5 Seconds To Share It

0 comments:

Post a Comment