1.1.1.2. pymacies_arg.core module

PymaciesArg.

An extension that registers all pharmacies in Argentina.

class pymacies_arg.core.PymaciesArg(date_str: str, base_file_dir: pathlib.Path)

Bases: object

Extension class for different versions of PymaciesArg.

Initialize the extension in pipeline.py:

import datetime
import os
import pathlib

from pymacies_arg import (
    PymaciesArg,
    PharmaciesLoader,
    LocationsLoader,
    DepartmentsLoader,
)

from sqlalchemy import create_engine

# This path points to project/
PATH = os.path.abspath(os.path.dirname(__file__))

# Join with a path separator so the database file lands inside project/
SQLALCHEMY_DATABASE_URI = "sqlite:///" + os.path.join(PATH, "db_data.db")

engine = create_engine(SQLALCHEMY_DATABASE_URI)

now = datetime.datetime.now()
# Zero-pad month and day so the string matches YYYY-mm-dd
date = now.strftime("%Y-%m-%d")

pymacies = PymaciesArg(date, pathlib.Path(PATH))

# Extract
file_paths = pymacies.extract_raws()

# Transform
provinces = [
    "BUENOS AIRES",
    "SANTA FE",
    "CABA",
    "TUCUMÁN",
    "MISIONES",
    "CÓRDOBA",
    "ENTRE RÍOS",
    "CHACO",
    "SALTA",
    "CORRIENTES",
    "RÍO NEGRO",
    "LA PAMPA",
    "SANTIAGO DEL ESTERO",
    "SAN LUIS",
    "SAN JUAN",
    "NEUQUÉN",
    "CHUBUT",
    "JUJUY",
    "CATAMARCA",
    "FORMOSA",
    "LA RIOJA",
    "SANTA CRUZ",
    "TIERRA DEL FUEGO",
    "MENDOZA",
]
paths = [
    pymacies.trasform_raws(file_paths, p) for p in provinces
]

# Load
for path in paths:
    PharmaciesLoader(engine).load_table(path[0])
    LocationsLoader(engine).load_table(path[1])
    DepartmentsLoader(engine).load_table(path[2])
date_str

The run date, in YYYY-mm-dd format.

Type

str

base_file_dir

A base file directory.

Type

Path
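
The YYYY-mm-dd format described for date_str can be produced with strftime, which zero-pads the month and day. A minimal sketch; the helper name run_date_str is hypothetical and not part of pymacies_arg:

```python
import datetime


def run_date_str(day: datetime.date) -> str:
    """Format a date as YYYY-mm-dd (hypothetical helper, not in pymacies_arg)."""
    return day.strftime("%Y-%m-%d")


print(run_date_str(datetime.date(2022, 3, 7)))  # 2022-03-07
```

The resulting string could then be passed as the first argument of PymaciesArg.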

extract_raws() → Dict[str, pathlib.Path]

Read files from source and extract the data.

Create a dataframe with the data and normalize the header format. Save each dataframe as a .csv file.

Returns

file_paths – A dict of stored data file paths.

Return type

dict[str, Path]
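
Since extract_raws() returns a dict of file paths, it may be useful to confirm that every file exists before moving on to the transform step. A minimal sketch under stated assumptions: missing_files is a hypothetical helper, and the dict keys below are made up, since the actual keys are not documented here.

```python
import pathlib
import tempfile
from typing import Dict, List


def missing_files(file_paths: Dict[str, pathlib.Path]) -> List[str]:
    """Return the keys whose path does not point to an existing file."""
    return [name for name, path in file_paths.items() if not path.is_file()]


# Stand-in for the dict returned by extract_raws(); the keys are assumptions.
with tempfile.TemporaryDirectory() as tmp:
    present = pathlib.Path(tmp) / "pharmacies.csv"
    present.write_text("id,name\n")
    example = {
        "pharmacies": present,
        "locations": pathlib.Path(tmp) / "locations.csv",  # never created
    }
    result = missing_files(example)
    print(result)  # ['locations']
```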

trasform_raws(file_paths: pathlib.Path, province: str) → List[pathlib.Path]

Read files from source and extract the data.

Create a dataframe with the data and normalize the header format. Save each dataframe as a .csv file.

Parameters
  • file_paths (Path) – The destination location.

  • province (str) – The province name in UPPERCASE.

Returns

data_paths – The destination locations of the transformed data.

Return type

list[Path]
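
The usage example at the top of this page indexes the returned list as path[0], path[1] and path[2] for the pharmacies, locations and departments loaders. A minimal sketch of giving those positions names, assuming that order holds; TransformedPaths and name_paths are hypothetical and not part of pymacies_arg:

```python
import pathlib
from typing import List, NamedTuple


class TransformedPaths(NamedTuple):
    """Hypothetical wrapper; field order is inferred from the usage example."""

    pharmacies: pathlib.Path
    locations: pathlib.Path
    departments: pathlib.Path


def name_paths(data_paths: List[pathlib.Path]) -> TransformedPaths:
    """Wrap a three-element list of paths, failing loudly on any other length."""
    if len(data_paths) != 3:
        raise ValueError(f"expected 3 paths, got {len(data_paths)}")
    return TransformedPaths(*data_paths)


# Stand-in for the list returned by trasform_raws(); file names are made up.
named = name_paths(
    [
        pathlib.Path("pharmacies.csv"),
        pathlib.Path("locations.csv"),
        pathlib.Path("departments.csv"),
    ]
)
print(named.locations.name)  # locations.csv
```

Named fields make the load step less dependent on remembering which index belongs to which table.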

pymacies_arg.core.extract_raws(date_str: str, base_file_dir: pathlib.Path) → Dict[str, pathlib.Path]

Read files from source and extract the data.

Create a dataframe with the data and normalize the header format. Save each dataframe as a .csv file.

Parameters
  • date_str (str) – The run date, in YYYY-mm-dd format.

  • base_file_dir (Path) – A base file directory.

Returns

file_paths – A dict of stored data file paths.

Return type

dict[str, Path]

pymacies_arg.core.trasform_raws(date_str: str, file_paths: pathlib.Path, province: str, base_file_dir: pathlib.Path) → List[pathlib.Path]

Read files from source and extract the data.

Create a dataframe with the data and normalize the header format. Save each dataframe as a .csv file.

Parameters
  • date_str (str) – The run date, in YYYY-mm-dd format.

  • file_paths (Path) – The destination location.

  • province (str) – The province name in UPPERCASE.

  • base_file_dir (Path) – A base file directory.

Returns

data_paths – The destination locations of the transformed data.

Return type

list[Path]
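
Both trasform_raws variants expect the province name in UPPERCASE, with accents kept (for example "TUCUMÁN" in the usage example). A minimal sketch of normalizing free-form input before the call; normalize_province is a hypothetical helper and not part of pymacies_arg:

```python
def normalize_province(name: str) -> str:
    """Collapse extra whitespace and uppercase the name, keeping accents."""
    return " ".join(name.split()).upper()


print(normalize_province("  entre  ríos "))  # ENTRE RÍOS
```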