Sunday, September 12, 2021

Disable automatic labels/aliases in sqlalchemy ORM query

By default, the sqlalchemy ORM query applies labels to columns in queries. For example

select data from table

becomes 

select data as table_data from table

Usually this behavior is fine, but when needed it can be disabled by subclassing and modifying the Query class.

sqlalchemy 1.3

from sqlalchemy.orm import Query, sessionmaker

class QueryNoLabels(Query):
    def __iter__(self):
        """Patch to disable auto labels"""
        context = self._compile_context(labels=False)
        context.statement.use_labels = False
        if self._autoflush and not self._populate_existing:
            self.session._autoflush()
        return self._execute_and_instances(context)

# engine is assumed to be defined elsewhere, e.g. engine = create_engine(...)
session_cls = sessionmaker(bind=engine, query_cls=QueryNoLabels)
session = session_cls()
session.query()  # queries made through this session will not auto-label columns

One such case is a bug in Databricks; see this SO post.
Reference: 
https://stackoverflow.com/questions/55754209/why-does-sqlalchemy-label-columns-in-query/69151444#69151444 

sqlalchemy 1.4

from sqlalchemy import LABEL_STYLE_NONE
from sqlalchemy.orm import Query, sessionmaker


class QueryNoLabels(Query):
    _label_style = LABEL_STYLE_NONE


session_cls = sessionmaker(bind=engine, query_cls=QueryNoLabels)
session = session_cls()
session.query()
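
To see the difference, here is a minimal verification sketch (the Thing model and the in-memory sqlite engine are placeholders, not part of the original example; QueryNoLabels is the class defined above, either the 1.3 or the 1.4 variant):

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class Thing(Base):
    # toy model used only for this illustration
    __tablename__ = "thing"
    id = Column(Integer, primary_key=True)
    data = Column(String)

engine = create_engine("sqlite://", echo=True)  # echo logs the emitted SQL
Base.metadata.create_all(engine)

session = sessionmaker(bind=engine, query_cls=QueryNoLabels)()
session.query(Thing).all()
# the default Query logs something like: SELECT thing.id AS thing_id, thing.data AS thing_data FROM thing
# with QueryNoLabels the logged SELECT uses plain column names instead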

Tuesday, January 26, 2021

A single trick for Docker which can greatly reduce image size

 Previously I shared some tips for writing better dockerfiles, but recently I discovered one more useful hack. The next tip reduces image size by dropping the intermediate layers, so COPY/ADD-ed files and files that were later removed no longer add to the final size.

FROM centos:8 as build
COPY my_file /tmp
RUN my_build


FROM centos:8

COPY --from=build / /
 

This Dockerfile uses a multi-stage build. The actual build happens in the first stage; in the second stage, all files are simply copied into a clean base image.

Also, when using this technique there is no need to group RUNs. Keep in mind that ENV, VOLUME, and CMD instructions are preserved from the last stage only.
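
For example, a rough sketch of the point above (MY_VAR and my_app are placeholders, not from the original post): anything the final image needs has to be declared in the last stage.

FROM centos:8 as build
# ENV here only affects the build stage
ENV MY_VAR=value
COPY my_file /tmp
RUN my_build

FROM centos:8
COPY --from=build / /
# metadata from the build stage is not carried over, so declare it again
ENV MY_VAR=value
CMD ["/usr/bin/my_app"]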

Monday, August 31, 2020

Migrating a centos VM from VirtualBox or VMWare to Hyper-V

 In order to migrate centos to Hyper-V, the Hyper-V drivers need to be included in the initramfs. The next command will do the task:

dracut -f --add-drivers 'hv_vmbus hv_storvsc hv_netvsc hv_utils hv_balloon hyperv-keyboard hyperv_fb hid-hyperv'

Otherwise the system will fail to detect the disk and will not boot. If you see the emergency shell, this is likely the case.
Possible error messages:
Timed out waiting for device ...
Warning: /dev/disk/... does not exist
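
To confirm the drivers actually made it into the new initramfs, it can be inspected with lsinitrd, which ships with dracut (without arguments it looks at the initramfs of the running kernel):

lsinitrd | grep -E 'hv_|hyperv'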

 Tested on centos 8
 

Tuesday, May 19, 2020

Sequential multiple builds with packer

Packer is a great tool to prepare and build virtual machines, but it lacks a feature to build multiple output formats from the same VM in a single run.
For example, vagrant, ova, and vhd in one run.
Fortunately, this can be accomplished by writing multiple packer json files: the initial VM is provisioned once in the first step and then kept around, and all subsequent steps just attach to the existing VM and export it in the desired format, possibly with extra provisioning (vagrant).

The key is to keep the VM until the last step. This can be done with the "virtualbox-vm" builder and
"keep_registered": true
# content of build.sh
packer build setup-vm.json   # import ova, or build from iso
packer build export-ova.json # export and keep vm
packer build export-vhd.json # still keep
packer build make-vagrant.json # export and destroy VM

First json: 
 "builders": [
    {
      "type": "virtualbox-ovf",
      "source_path": "base.ova",
      "vm_name": "build-vm",
      "keep_registered": true // this is important
      ...
    },
  ]
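
Intermediate jsons (a sketch based on the notes above, not taken from the original files): reuse the same VM via the "virtualbox-vm" builder and keep it registered for the next step.
"builders": [
  {
    "type": "virtualbox-vm",
    "vm_name": "build-vm", // same vm name
    "keep_registered": true // keep the vm for the next step
    ...
  }
],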

Last json: 
"builders": [
  {
    "type": "virtualbox-vm",
    "vm_name": "build-vm", // same vm name
    "keep_registered": false // finally destroy vm
    ...
  }
],

Friday, May 8, 2020

Tips for writing dockerfiles

Sharing here some insights I got while working on reducing the size of an image.

Docker images are Copy-On-Write based, and a new layer is created for each RUN command. That means the cleanup step must be in the same RUN as the main operation.

Wrong:
RUN apt-get install -y my-package # adds the package and also the apt cache to this layer
RUN apt-get clean # removes the cache from the resulting image, but does not reduce its size

Right:
RUN apt-get install -y my-package && apt-get clean # adds only the package itself

The docker history command can be used to show a detailed breakdown of layer sizes.
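
For example (the image name is a placeholder; --no-trunc shows the full command behind each layer):

docker history --no-trunc my-image:latest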


When there is a lot going on in a dockerfile, grouping everything into a single RUN hurts readability. To help with that, you can split commands over several RUNs and perform cleanup after each of them.

# installing build dependencies
RUN apt-get install -y build_dep1 build_dep2 && apt-get clean

# installing runtime dependencies
RUN apt-get install -y runtime_dep1 runtime_dep2 && apt-get clean



Or use inline comments in a single RUN.

RUN apt-get install -y \
  `# installing build dependencies` \
  build_dep1 `# required for ...` \
  build_dep2 `# also required for build` \
  `# installing runtime dependencies` \
  runtime_dep1 `# needed for this` \
  runtime_dep2 `# needed for that` && \
  apt-get clean `# clean up`


Note that I commented on why each dependency is required. This will help a lot when it's time to migrate to a new base image, like updating from ubuntu 16 to ubuntu 18.

Wednesday, May 15, 2019

View CSV from browser on MacOS

To quickly view the content of a CSV file from the browser, you can use the built-in macOS file preview.
This works for any browser. Tested on firefox.

Create an application in Automator
  • open Automator
  • add a "Run Shell Script" action
  • paste the code qlmanage -c public.plain-text -p "$@" 1>/dev/null 2>/dev/null
  • set "Pass input:" to "as arguments"
  • save as a preview-text app to Applications
  • open the csv file in the browser
  • select "open with"
  • choose the preview-text app

Saturday, April 6, 2019

Unvolume in docker

Sometimes there is a need to use a base image, but without any defined volumes. For example, storing some test data in a database image so it can easily be reverted.
Docker does not currently support removing volumes, ports, and other image properties, and is unlikely to any time soon.
There is no UNVOLUME, UNEXPOSE, or UNSETENV.

Several workarounds exist.

Edit image tar file

Just remove the volumes from the metadata.
There is even a tool for that:
https://github.com/gdraheim/docker-copyedit

Multistage build

Copy the content of the original image, without the metadata.
Use the next dockerfile:


# Dockerfile
FROM mysql as orig

FROM ubuntu:bionic as image
# care must be taken, this will not preserve fs ownership
# this will copy all files, without metadata
COPY --from=orig / /

# include all instructions that are not file related
ENV ...

ENTRYPOINT ["docker-entrypoint.sh"]

EXPOSE 3306
CMD ["mysqld"]