Monday, August 5, 2019

File Automation in S3 Bucket AWS with Lambda Function

Problem:

  • Files are usually dumped into an S3 bucket with no structure. Is there a way to automate sorting them, so that every image file goes into a folder called Images, every PDF into a folder called PDF, and so on? Yes: this code does exactly that. It creates the folders (in S3, key prefixes) and moves each file into its respective folder.
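The sorting rule described above can be sketched as a small lookup from file extension to folder name. This is only an illustrative sketch, not the code from this post: `FOLDER_BY_EXTENSION` and `target_key` are hypothetical names, and the folder names are the ones the script in this post uses.

```python
import posixpath

# Hypothetical mapping, assumed for illustration; folder names follow the post.
FOLDER_BY_EXTENSION = {
    ".jpg": "Images",
    ".jpeg": "Images",
    ".png": "Images",
    ".csv": "CSV",
    ".xlsx": "Excel",
    ".pdf": "PDF",
    ".mp4": "Video",
}


def target_key(key):
    """Return the destination key for a top-level object, or None to leave it alone."""
    # Keys that already contain a "/" are treated as already sorted.
    if "/" in key:
        return None
    # Only the final extension counts, compared case-insensitively.
    extension = posixpath.splitext(key)[1].lower()
    folder = FOLDER_BY_EXTENSION.get(extension)
    if folder is None:
        return None
    return "{}/{}".format(folder, key)


print(target_key("monthly-milk-production.csv"))  # CSV/monthly-milk-production.csv
```

Matching on the final extension, rather than on a substring anywhere in the name, keeps a file such as `archive.csv.pdf` from qualifying for two folders at once.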

Github: https://github.com/soumilshah1995/File-Automation-in-S3-Bucket-AWS-with-Lambda-Function

Soumil Nitin Shah

Bachelor in Electronic Engineering | Master in Electrical Engineering | Master in Computer Engineering

Hello! I’m Soumil Nitin Shah, a software and hardware developer based in New York City. I have completed my Bachelor’s in Electronic Engineering and a double Master’s in Computer and Electrical Engineering. I develop Python-based cross-platform desktop applications, web pages, software, REST APIs, databases and much more, and I have more than 2 years of experience in Python.

In [7]:
import boto3


class FileSorter(object):
    
    def __init__(self, BucketName='soumilnitinshah832019'):
        self.BucketName=BucketName
        self.client = boto3.client('s3')
        self.response = self.client.list_objects(Bucket=self.BucketName)
    
    @property
    def sort(self):
        
        data = []
        
        # Response to Get all Objects
        for x in self.response.get("Contents", []):
            data.append(x.get("Key", None))

        
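        # Iterate Over Each File
        for x in data:
            print(x)

            # Skip keys already inside a folder so repeated runs do not nest them
            if '/' in x:
                continue

            # Match on the file extension, ignoring case
            if x.lower().endswith(('.jpg', '.png', '.jpeg')):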
                
                # For Each File Get the Bytes Data
                response_new = self.client.get_object(Bucket=self.BucketName, Key=x)

                # Read the Bytes Data
                MyData = response_new["Body"].read()

                # Move That File into Folder called Images
                response = self.client.put_object(ACL='private',
                                             Bucket=self.BucketName,
                                             Body=MyData,
                                             Key='Images/{}'.format(x))

                # Delete that File Outside Directory 
                response = self.client.delete_object(Bucket=self.BucketName,Key=x)


            if x.lower().endswith('.csv'):
                
                # For Each File Get the Bytes Data
                response_new = self.client.get_object(Bucket=self.BucketName, Key=x)

                # Read the Bytes Data
                MyData = response_new["Body"].read()

                # Move That File into Folder called CSV
                response = self.client.put_object(ACL='private',
                                             Bucket=self.BucketName,
                                             Body=MyData,
                                             Key='CSV/{}'.format(x))

                # Delete that File Outside Directory 
                response = self.client.delete_object(Bucket=self.BucketName,Key=x) 


            if x.lower().endswith('.xlsx'):
                
                # For Each File Get the Bytes Data
                response_new = self.client.get_object(Bucket=self.BucketName, Key=x)

                # Read the Bytes Data
                MyData = response_new["Body"].read()

                # Move That File into Folder called Excel
                response = self.client.put_object(ACL='private',
                                             Bucket=self.BucketName,
                                             Body=MyData,
                                             Key='Excel/{}'.format(x))

                # Delete that File Outside Directory 
                response = self.client.delete_object(Bucket=self.BucketName,Key=x)


            if x.lower().endswith('.pdf'):
                
                # For Each File Get the Bytes Data
                response_new = self.client.get_object(Bucket=self.BucketName, Key=x)

                # Read the Bytes Data
                MyData = response_new["Body"].read()

                # Move That File into Folder called PDF
                response = self.client.put_object(ACL='private',
                                             Bucket=self.BucketName,
                                             Body=MyData,
                                             Key='PDF/{}'.format(x))

                # Delete that File Outside Directory 
                response = self.client.delete_object(Bucket=self.BucketName,Key=x) 

            if x.lower().endswith('.mp4'):
                
                # For Each File Get the Bytes Data
                response_new = self.client.get_object(Bucket=self.BucketName, Key=x)

                # Read the Bytes Data
                MyData = response_new["Body"].read()

                # Move That File into Folder called Video
                response = self.client.put_object(ACL='private',
                                             Bucket=self.BucketName,
                                             Body=MyData,
                                             Key='Video/{}'.format(x))

                # Delete that File Outside Directory 
                response = self.client.delete_object(Bucket=self.BucketName,Key=x) 
        
        print("Done ")
                
                
if __name__ == "__main__":
    obj = FileSorter()
    obj.sort
monthly-milk-production.csv
Done 
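
The title mentions a Lambda function, while the listing above runs as a plain script. One way the same sorting could be wrapped in a Lambda handler is sketched here. It is a sketch under stated assumptions rather than the author's code: `sort_bucket`, `FOLDERS` and the `BUCKET_NAME` environment variable are hypothetical names, and it swaps in `list_objects_v2` pagination and a server-side `copy_object` in place of downloading and re-uploading each file.

```python
import os

# Hypothetical extension-to-folder mapping; folder names follow the post.
FOLDERS = {
    "jpg": "Images", "jpeg": "Images", "png": "Images",
    "csv": "CSV", "xlsx": "Excel", "pdf": "PDF", "mp4": "Video",
}


def sort_bucket(client, bucket, folders):
    """Move top-level objects into folders chosen by file extension.

    `client` is expected to behave like boto3.client('s3').
    Returns a list of (old_key, new_key) pairs.
    """
    # Collect every key first so the listing is not affected by the moves.
    keys = []
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # An empty bucket yields a page without a "Contents" entry.
        keys.extend(item["Key"] for item in page.get("Contents", []))

    moved = []
    for key in keys:
        if "/" in key:
            continue  # already inside a folder
        _, dot, extension = key.rpartition(".")
        folder = folders.get(extension.lower()) if dot else None
        if folder is None:
            continue  # unknown or missing extension: leave the object alone
        new_key = "{}/{}".format(folder, key)
        # Server-side copy, then delete the original: S3 has no rename call.
        client.copy_object(Bucket=bucket, Key=new_key,
                           CopySource={"Bucket": bucket, "Key": key})
        client.delete_object(Bucket=bucket, Key=key)
        moved.append((key, new_key))
    return moved


def lambda_handler(event, context):
    # Imported here so sort_bucket can be exercised without boto3 installed.
    import boto3

    bucket = os.environ["BUCKET_NAME"]  # hypothetical environment variable
    moved = sort_bucket(boto3.client("s3"), bucket, FOLDERS)
    return {"moved": len(moved)}
```

A single `copy_object` call handles objects up to 5 GB; larger objects would need a multipart copy. Running the handler on a schedule (for example through an Amazon EventBridge rule) and granting the function's role list, read, write and delete permissions on the bucket are left out of this sketch.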