Sunday, September 12, 2021

Disable automatic labels/aliases in sqlalchemy ORM query

By default, a SQLAlchemy ORM query labels every column it selects. For example,

select data from table

becomes 

select data as table_data from table
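To see what such a label changes in practice, here is a minimal sketch that uses only the standard-library sqlite3 module rather than SQLAlchemy. The table name `items` and the helper `result_columns` are made up for illustration; the point is only that the alias becomes the column name the database reports back to the client.

```python
import sqlite3

# In-memory database with a hypothetical table named "items".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (data TEXT)")
conn.execute("INSERT INTO items (data) VALUES ('hello')")


def result_columns(sql):
    """Return the column names the database reports for a query."""
    cursor = conn.execute(sql)
    return [col[0] for col in cursor.description]


plain = result_columns("SELECT data FROM items")
labeled = result_columns("SELECT data AS items_data FROM items")

print(plain)    # ['data']
print(labeled)  # ['items_data']
```

The rows are identical in both cases; only the reported column name differs, which is what matters when a backend handles aliased columns incorrectly.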

Usually this behavior is fine. If needed, though, it can be disabled by subclassing the Query class and modifying it.

sqlalchemy 1.3

from sqlalchemy.orm import Query, sessionmaker

class QueryNoLabels(Query):
    def __iter__(self):
        """Patch to disable auto labels"""
        context = self._compile_context(labels=False)
        context.statement.use_labels = False
        if self._autoflush and not self._populate_existing:
            self.session._autoflush()
        return self._execute_and_instances(context)

# `engine` is assumed to be an existing SQLAlchemy Engine.
session_cls = sessionmaker(bind=engine, query_cls=QueryNoLabels)
session = session_cls()
session.query()

One case where this is needed is a bug in Databricks; see this Stack Overflow post.
Reference: 
https://stackoverflow.com/questions/55754209/why-does-sqlalchemy-label-columns-in-query/69151444#69151444 

sqlalchemy 1.4

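from sqlalchemy import LABEL_STYLE_NONE
from sqlalchemy.orm import Query, sessionmaker

class QueryNoLabels(Query):
    # Override the default label style so that columns are not aliased.
    _label_style = LABEL_STYLE_NONE

# `engine` is assumed to be an existing SQLAlchemy Engine.
session_cls = sessionmaker(bind=engine, query_cls=QueryNoLabels)
session = session_cls()
session.query()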

Tuesday, January 26, 2021

A single trick for Docker that can greatly reduce image size

Previously I shared some tips for writing better Dockerfiles, but recently I discovered one more useful hack. The tip below reduces image size by leaving out of the final image the data for files that were added with COPY/ADD or created during the build and later removed.

FROM centos:8 as build
COPY my_file /tmp
RUN my_build


FROM centos:8

COPY --from=build / /
 

This Dockerfile uses a multi-stage build. The actual build happens in the first stage. In the second stage, all the files are simply copied onto a clean base image.
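Why this helps can be shown with a toy model in Python. This is only a sketch of the size accounting, not how Docker is implemented: the layer structure, file names, and sizes (in MB) are invented, and the base image and per-layer overhead are ignored. It assumes that `my_build` deletes its input file when it finishes.

```python
def layered_size(layers):
    """Size when every layer is kept: a file removed later still
    occupies space in the layer that added it."""
    return sum(sum(layer["added"].values()) for layer in layers)


def final_files(layers):
    """Files visible in the filesystem after applying all layers in order."""
    files = {}
    for layer in layers:
        files.update(layer["added"])
        for name in layer["removed"]:
            files.pop(name, None)
    return files


def flattened_size(layers):
    """Size when only the final filesystem is copied into a new stage."""
    return sum(final_files(layers).values())


# Hypothetical build stage, sizes in MB.
build_stage = [
    {"added": {"my_file": 500}, "removed": []},              # COPY my_file /tmp
    {"added": {"app_binary": 20}, "removed": ["my_file"]},   # RUN my_build
]

print(layered_size(build_stage))    # 520
print(flattened_size(build_stage))  # 20
```

In this model the single-stage image carries the 500 MB input file forever, while the copied stage contains only what is still present at the end of the build.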

Also, when using this technique there is no need to group RUN instructions. Keep in mind that only the ENV, VOLUME, and CMD instructions from the last stage are kept in the final image.