Friday, September 11, 2020

Google Cloud takeaways

This blog post documents my experience with GCP in September 2020


I worked with four Google services:


1. Storage

2. Functions

3. Cloud Run

4. Scheduler

Before working, you need to:

A. Create a project

B. Create a service account and download the JSON key file. This is the authentication needed to access GCP's services

C. Enable the APIs of all the services above


1. Storage:

Inside storage you can create different buckets

Inside each bucket files are stored as objects or blobs

There are no real folders, just objects whose names can contain slashes that act like directory paths
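For instance, an object named reports/2020/jan.csv lives directly in the bucket; the slashes only simulate a hierarchy. A small pure-Python sketch of this idea (the object names here are made up):

```python
# GCS buckets are flat: these are four separate objects, not files
# inside folders. (Hypothetical object names for illustration.)
objects = [
    "reports/2020/jan.csv",
    "reports/2020/feb.csv",
    "models/model.pkl",
    "readme.txt",
]

def list_prefix(objects, prefix):
    """Mimic listing a 'folder' by filtering object names on a prefix,
    which is essentially what the GCS list API does."""
    return [name for name in objects if name.startswith(prefix)]

print(list_prefix(objects, "reports/2020/"))
```

This is why deleting "a folder" in GCS really means deleting every object sharing that prefix.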


2. Functions

These are simply Python functions that get invoked when you call the function's URL after it is created
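A minimal sketch of such a function, assuming the Python runtime where the entry point receives a Flask request object (the function name and greeting here are my own):

```python
def hello_http(request):
    """HTTP Cloud Function entry point (hypothetical example).

    GCP calls this function for every request to the function's URL;
    `request` is a flask.Request, so query parameters are in request.args.
    """
    name = "world"
    if request is not None and request.args and "name" in request.args:
        name = request.args["name"]
    return f"Hello, {name}!"
```

The string returned here becomes the HTTP response body seen by whoever calls the URL.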


3. Cloud Run

This service runs Docker images that have been previously built. The steps are as follows:


1. Open a cloud shell and activate a project 

    gcloud config set project [PROJECT ID]

2. Create a folder for the project

3. In the project folder, add:

    A. Dockerfile

    B. requirements.txt

    C. All Python scripts, data, and models

4. Run Google's "docker build" equivalent:

    gcloud builds submit --tag gcr.io/[PROJECT ID]/[CONTAINER NAME]


5. Deploy the container using the equivalent of "docker run":


    gcloud run deploy --image gcr.io/[PROJECT ID]/[CONTAINER NAME] --platform managed --allow-unauthenticated --memory 4G --cpu 2


6. A link is generated; when opened, the container runs


4. Scheduler

To call a certain service, make sure the target is HTTP and that the service account is authorized to invoke the service using an OIDC token


Lessons learned

1. Cloud Run and Functions are designed for short-lived web requests, or perhaps web scraping, but not for heavyweight analytics

2. Mapping the port was a pain. Make sure the container uses port 8080 in the Dockerfile

3. To run Python scripts, make sure you use Flask. Specifically, use app.run(host='0.0.0.0', port=8080)

4. A good architecture is to have the scheduler call Functions on a schedule

5. The containers built on GCP are stored in Google Container Registry. A container can be deployed on Cloud Run, Compute Engine, or a Kubernetes cluster; this can be done easily using a button at the top.

6. To copy data to or from GCS, use the command gsutil cp gs://

7. To schedule jobs on Linux, use a crontab entry like * * * * * xx/xx/xx/python script.py, where xx/xx/xx is the path to the Python interpreter of the environment
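Lessons 2 and 3 above can be combined into a minimal Flask app suitable for Cloud Run (the route and response text are made up for illustration):

```python
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    # Whatever work the container should do goes here; the return
    # value is the HTTP response body.
    return "OK from Cloud Run"

if __name__ == "__main__":
    # Cloud Run sends traffic to port 8080 inside the container;
    # host 0.0.0.0 makes Flask listen on all interfaces, not just
    # localhost, so the platform's proxy can reach it.
    app.run(host="0.0.0.0", port=8080)
```

With this file as the container's entry point, the EXPOSE/port settings in the Dockerfile and the app.run() call agree on 8080, which avoids the port-mapping pain mentioned above.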







Thursday, September 10, 2020

Schedule jobs on Google Cloud using Functions and Scheduler

 Source: https://www.youtube.com/watch?v=t7e0dNSCmzI



Create a function in Cloud Functions and get its URL


In the info panel on the right, click "Add a member"

Insert the service account email

For the role, choose Cloud Functions Invoker



Create a Cloud Scheduler job


Set the target to HTTP

Insert the function's URL into the URL field

HTTP method is GET

Auth header is OIDC token

Service account is the email of the service account


That's it!


Tuesday, September 8, 2020

Deploy a Docker container on GCP

 Source:

https://www.youtube.com/watch?v=LxHiCZCKwa8



1. Open GCP console and click on the cloud shell icon on top right [Make sure cloud shell is enabled]


2. Create a new folder

mkdir myproject


3. In the project folder, create a requirements.txt file

nano requirements.txt


4. Create a Dockerfile to build the container


nano Dockerfile


5. Use gsutil to copy files (src code, pkl files, etc.) from Cloud Storage


6. Make the Dockerfile executable

chmod +x Dockerfile 


7. Build the Dockerfile into a tagged image: [HOSTNAME]/[PROJECT ID]/[GIVE A NAME]

gcloud builds submit --tag gcr.io/stock-288218/mabolfadl_stk


8. Deploy the container using the command below

gcloud run deploy --image gcr.io/stock-288218/mabolfadl_stk


9. A link will appear which can be used to interact with the container









Wednesday, September 2, 2020

Read google sheets in python

 Source: https://www.twilio.com/blog/2017/02/an-easy-way-to-read-and-write-to-a-google-spreadsheet-in-python.html



Follow the steps in the link, then use this code:


import pandas as pd
import gspread
from oauth2client.service_account import ServiceAccountCredentials

utils_dir = '11_utils/'

# use creds to create a client to interact with the Google Drive API
scope = ['https://spreadsheets.google.com/feeds',
         'https://www.googleapis.com/auth/spreadsheets',
         'https://www.googleapis.com/auth/drive.file',
         'https://www.googleapis.com/auth/drive']

creds = ServiceAccountCredentials.from_json_keyfile_name(utils_dir + 'sheets_api.json', scope)
client = gspread.authorize(creds)

# Find a workbook by name and open the first sheet
# Make sure you use the right name here.
sheet = client.open("live_prices").sheet1
data = sheet.get_all_records()  # Get a list of all records

df = pd.DataFrame(data)
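get_all_records() returns a list of dicts, one per row, keyed by the sheet's header row; with made-up sample rows, the DataFrame conversion at the end looks like this:

```python
import pandas as pd

# Hypothetical rows shaped like gspread's get_all_records() output:
# a list of dicts keyed by the sheet's header row.
records = [
    {"ticker": "GOOG", "price": 1520.5},
    {"ticker": "MSFT", "price": 208.9},
]

# Each dict becomes a row; the shared keys become the columns.
df = pd.DataFrame(records)
print(df.shape)  # (2, 2)
```

From here the sheet can be treated like any other DataFrame (filtering, joins, plotting, and so on).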


Loud fan of desktop

 Upon restart the fan of the desktop got loud again. I cleaned the desktop from the dust but it was still loud (Lower than the first sound) ...