A few days ago, this article about “bobos” was published on Slate, and it took me aback. Both its content and its wording bother me. Coming from one of Slate’s founders, I expected something much more relevant and thorough.
In fact, the article is a jumble of fallacious arguments and preconceived ideas asserted without analysis. Only one interesting point is raised, at the end of the article, and of course it is not explored in any depth.
I like to simplify my workflow. I have plenty of aliases to speed up the execution of recurrent tasks, such as rebasing against “master” when I want to submit a patch to a project on GitHub.
For that, I set the Git branch I am hacking on to track the upstream remote, not origin. Here is why.
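The setup described above can be sketched as follows; the branch name my-feature is a made-up example, and the remote URL assumes the usual fork-based GitHub workflow where origin points to your own fork:

```shell
# Add the canonical repository as a remote named "upstream"
# (origin is assumed to already point to your personal fork).
git remote add upstream https://github.com/apache/spark.git
git fetch upstream

# Make the branch you are hacking on track upstream/master, so that
# a bare `git rebase` rebases against upstream, not your fork.
git checkout my-feature
git branch --set-upstream-to=upstream/master
```

With this in place, a plain `git rebase` (or a short alias wrapping it) always targets the canonical history, which is exactly what you want before submitting a patch.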
I am working on a set of patches for the Apache Spark project to ease the deployment of complex Python programs with external dependencies. Deploying a job should be as easy as possible, and Wheels make it really easy.
Deployment is never a fascinating task; as developers, we want our code to work in production exactly as it does on our machines. Python has never been really good at deployment, but in recent years it has become easier and more standardized to package a project, describe its dependencies in a unified way, and have them installed properly with Pip, isolated inside a virtualenv. It is, however, not obvious at first sight for non-Pythonistas, and several tasks remain to make everything automatic for a Python package developer, and thus for a PySpark developer as well.
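The standard workflow mentioned above (an isolated virtualenv, dependencies resolved by Pip, the project shipped as Wheels) can be sketched like this; the paths and the dist/ directory are illustrative, and the project is assumed to have a standard setup.py or equivalent packaging metadata:

```shell
# Create and enter an isolated environment.
python -m venv .venv
. .venv/bin/activate

# Build the project and all of its dependencies as Wheel archives.
pip install wheel
pip wheel . -w dist/

# Install the resulting Wheels into the virtualenv.
pip install dist/*.whl
```

The same dist/ directory of Wheels is what one would like to hand over to the Spark executors, instead of relying on whatever happens to be installed there.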
In this blog post I describe some thoughts on how PySpark should let users deploy full Python applications, and no longer just simple Python scripts, by handling Wheels and isolated virtual environments.
The main idea behind this proposal is to let developers control the Python environment deployed on the executors, instead of being jailed by whatever is actually installed in the Python environment of Spark’s executors. If you agree with this approach, please add a comment to the JIRA ticket to help speed up its integration into Spark 2.x.
I currently have 6 pull requests open on the Apache Spark project, mostly code housekeeping on the PySpark module… some of them have been open for two months!
This post describes an idea for a data processing framework built in Python, inspired by state-of-the-art actor systems such as Akka. It is a bit like a restricted version of a lambda architecture.
This could be used in ETL, data extraction, or any custom warehouse process where data is pushed or pulled from one side, needs some obscure processing that may involve fetching more data from somewhere else, and is then stored in a database or storage area.
I don’t have a name for this framework yet; I like how “Akka” is a short palindrome. Maybe I’ll find a nice palindromic name in the near future.
I’ll start with a high-level overview of the Lambda Architecture and the Actor Model, where I found some of my inspiration, and then describe how I would like my system to differ from them.
I have just released a new version of Guake: Guake 0.8.6. It is mostly bug fixes, plus a rework of the way the main window placement is done. I hope it will work better than the previous version. Thanks to the various contributors! Here…
I am learning Scala.
I am sad to see that, again, like OCaml, like Rust, like CoffeeScript, Scala does not like the return statement. It is implicit: a block evaluates to its last expression.
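A minimal sketch of what this means in practice; the object and method names here are made up for illustration:

```scala
object LastExpr {
  // The value of a block is the value of its last expression:
  // no `return` keyword is needed.
  def square(x: Int): Int = {
    val squared = x * x
    squared // implicitly "returned"
  }

  // An if/else is itself an expression, so both branches
  // yield a value implicitly as well.
  def abs(x: Int): Int =
    if (x >= 0) x else -x

  def main(args: Array[String]): Unit = {
    println(square(4)) // prints 16
    println(abs(-3))   // prints 3
  }
}
```

An explicit `return` is still legal inside a method in Scala, but it is discouraged: it only works within a `def` and can have surprising interactions with closures.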
Presentation: Guake is a drop-down terminal for GNOME, highly inspired by the terminal used in a famous FPS game. It is intended to be lightning fast to reach and interact with, in a developer’s or administrator’s everyday life….
I am not really convinced by YAML over JSON. For me, it has the following advantages: it is easy to read, and information is not lost in a sea of “<” and “>” like with XML. I actually don’t like the ambiguity in JSON…
I never liked the way ‘stash’ works in Git. It is not really obvious to use, and moreover, it is a nightmare when a conflict arises during a rebase.
This article discusses the stash concept in Git, its advantages and disadvantages, and why I recommend not using it at all.