Python on Android: It's alive!

Posted by Russell Keith-Magee on 29 February 2020

For the last few months, we've had a contractor (Asheesh Laroia) working on restoring Android support to the BeeWare tool suite.

I'm very happy to announce that we've reached our first major milestone: a working, pure Python app running on an Android device!

This isn't the end of the project - it's only the beginning. There's still a lot of fine tuning to do (in particular regarding the size of the support libraries), and we need to integrate this support into Briefcase and Toga.

Even so, the adventurous can try out Asheesh's work-in-progress. His Python Android Support repository contains the current state of development, along with fairly good instructions for getting started. For now you still need at least a little knowledge of native Android programming to make full use of this repository; but if you want to reproduce the results from the video, this repository (and the repositories linked from it) should provide everything you need.

A big thank you goes once again to the Python Software Foundation. Without their financial support, this work would still be just a concept. This project is only one of many ways in which donations to the PSF support the Python community and the Python ecosystem. If your company uses Python in any way, I strongly encourage you to support the PSF financially so that it can continue to support projects like this one.

A big thank you is also due to Asheesh. Without his outstanding skills, attention to detail, and enthusiasm for compiler errors, we wouldn't have made the incredibly rapid progress we're now seeing.

Look out for more announcements soon!

We have a contractor for our Android contract!

Posted by Russell Keith-Magee on 26 November 2019

A few months ago, we announced that the BeeWare project had received a grant from the PSF to improve our Android support. At the time, we put out a call for contractors to help us carry out this work.

We're very pleased to announce that we have now selected a contractor: Asheesh Laroia.

Asheesh is a regular speaker at Python events, where he has delved into a range of detailed and complex topics. He also impressed us with his list of unconventional technical integration projects, which he has worked on in both professional and casual capacities.

When asked why he applied to work with BeeWare on this contract, Asheesh said: "I use an Android phone every day, and I'm honored to help realize the BeeWare vision of using Python to create first-class, native applications."

Asheesh will start work in mid-December, and if all goes well, we should see significant results from mid-to-late February. If you'd like to follow the progress, you can follow BeeWare on Twitter; we'll also post major updates on this blog.

BeeWare project receives PSF education grant

Posted by Russell Keith-Magee on 25 September 2019

The BeeWare project wants to make it possible for all Python developers to write native applications for desktop and mobile platforms. We have solid support for most desktop operating systems and iOS, but we know that our Android support is lacking. The BeeWare core team knows what needs to be done to solve the problem - what we've been missing so far is time and resources.

Thanks to the PSF Education Grants Group, that is no longer an issue. We have received a grant of US$50,000 to bring BeeWare's Android support to a level comparable with our iOS support. As we don't currently have the time to do the work ourselves, we're calling for contractors to help us deliver this support.

This is a paid contract, expected to last 3-6 months (depending on the experience of the chosen contractor). You also don't need to be based in the US or Europe; the opportunity is open to anyone who meets the requirements of the contract.

Unfortunately, this task requires some sophisticated skills, and we're not in a position to offer extensive mentoring. A successful application will likely require some experience and a track record with the technologies involved.

A full role description and scope of work for the contract is available. To register your interest, please send your résumé and cover letter to contracts@beeware.org.

We look forward to being able to announce full Android support in the near future!

2018 Google Summer of Code - VOC Optimization

Posted by Patience Shyu on 14 August 2018

Google Summer of Code is coming to an end. I've spent the summer working on optimizing the VOC compiler, and I’m super excited to share the results.

Results

There are a couple of ways to evaluate the performance improvement from my project.

Microbenchmarks

Firstly, we introduced a microbenchmarking suite. Each microbenchmark is a small piece of Python code that tests a single and specific Python construct, or datatype, or control flow. The benchmarking infrastructure itself is crude (essentially it just tells you the total amount of processor time it took to run, with no fancy statistics) but it has been extremely useful to me while working on performance features to verify performance gain.
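As a rough illustration of what such a microbenchmark can look like, here is a minimal sketch written for plain CPython. The names run_microbenchmark and nested_loop_benchmark are hypothetical and are not taken from the VOC repository; the only idea borrowed from the description above is that a small piece of code exercising one construct is timed using total processor time, with no statistics.

```python
import time


def run_microbenchmark(func, repeats=5):
    """Return the total processor time (seconds) spent calling func `repeats` times."""
    start = time.process_time()
    for _ in range(repeats):
        func()
    return time.process_time() - start


def nested_loop_benchmark():
    # Exercises one specific construct: a nested for loop over ranges.
    total = 0
    for i in range(200):
        for j in range(200):
            total += i * j
    return total


if __name__ == "__main__":
    elapsed = run_microbenchmark(nested_loop_benchmark)
    print("nested_loop_benchmark: {:.4f}s of processor time".format(elapsed))
```

Running the same script against two builds and comparing the reported times is the kind of manual before/after check described here.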

The idea is that the benchmarking suite is not to be run as part of the full test suite, but rather as needed and manually whenever an optimization is implemented. It also provides a way to check and prevent performance regression, especially on the "optimized" parts of VOC. While it doesn't really make sense to record specific numbers, as they will always vary from machine to machine, it should be reasonably easy to compare two versions of VOC. Benchmark numbers are included on each optimization-related PR I've worked on this summer (see PR log below), and I hope that more benchmarks will be added as more performance efforts are carried out in the future.

Pystone

Pystone is a Python Dhrystone, a standard benchmark for testing the performance of Python on a machine. Here are the before and after results on my machine:

May 10th, 2018:

$ python setup.py test -s tests.test_pystone
test_pystone (tests.test_pystone.PystoneTest) ... Pystone(1.2) time for 50000 passes = 101.833
This machine benchmarks at 490.998 pystones/second

$ python setup.py test -s tests.test_pystone
test_pystone (tests.test_pystone.PystoneTest) ... Pystone(1.2) time for 50000 passes = 101.298
This machine benchmarks at 493.595 pystones/second

$ python setup.py test -s tests.test_pystone
test_pystone (tests.test_pystone.PystoneTest) ... Pystone(1.2) time for 50000 passes = 102.247
This machine benchmarks at 489.014 pystones/second

On current master (Aug 14th, 2018):

$ python setup.py test -s tests.test_pystone
test_pystone (tests.test_pystone.PystoneTest) ... Pystone(1.2) time for 50000 passes = 11.2300
This machine benchmarks at 4452.37 pystones/second

$ python setup.py test -s tests.test_pystone
test_pystone (tests.test_pystone.PystoneTest) ... Pystone(1.2) time for 50000 passes = 10.9833
This machine benchmarks at 4552.36 pystones/second

$ python setup.py test -s tests.test_pystone
test_pystone (tests.test_pystone.PystoneTest) ... Pystone(1.2) time for 50000 passes = 10.9498
This machine benchmarks at 4566.29 pystones/second
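To put the two sets of runs side by side, the short sketch below averages the pystones/second figures quoted above and computes the ratio. The numbers are copied from the six runs; taking the mean of three runs is simply one reasonable way to summarize them, not something the benchmark itself does.

```python
from statistics import mean

# pystones/second, copied from the runs above
before = [490.998, 493.595, 489.014]   # May 10th, 2018
after = [4452.37, 4552.36, 4566.29]    # master on Aug 14th, 2018

speedup = mean(after) / mean(before)

print("mean before: {:.1f} pystones/second".format(mean(before)))
print("mean after:  {:.1f} pystones/second".format(mean(after)))
print("speedup:     {:.1f}x".format(speedup))
```

On these figures the mean goes from about 491 to about 4524 pystones/second, a speedup of roughly 9x.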

Conclusions

Some things that I learned about VOC while working on this project:

1. Object creation in the JVM is expensive. This definitely does not mean that the VOC user writing Python should think about minimizing the number of objects that she creates, but rather that any time we can non-trivially reduce the number of objects created during bytecode transpilation or in VOC-defined function calls, we can expect to see a huge performance boost. Integer and boolean preallocation, which is about reusing objects that have already been created, was one of the most significant improvements we made this summer.

2. Method calls in VOC are expensive. This is essentially due to the process of invoking a callable: you have to check that the method is defined on the object, then construct it (read: object creation!), and check the arguments, before it can actually be called. (This is done using reflection, which is super interesting and confusing in itself.) And this is the reason why refactoring the Python comparison functions made such a big performance impact, because we were able to circumvent this process.

3. Exception-heavy code is expensive. Again, this is not to say that the programmer is on the hook for being frugal when throwing exceptions, but that VOC benefits greatly by avoiding the use of exceptions internally except when strictly necessary. For instance, Python uses StopIteration exceptions to signal the end of a for loop, and they quickly rack up when you have nested loops (everything is ultimately related to object creation!). That was the motivation for the nested loops optimization.
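The preallocation idea from the first point can be sketched in a few lines. This is an illustration in Python rather than VOC's Java runtime code: the PyInt class and its get method are hypothetical, and the cached range of -5 to 256 is borrowed from CPython's small-integer cache as an assumption, not taken from VOC.

```python
class PyInt:
    """Hypothetical stand-in for a boxed integer object in a runtime."""

    CACHE_MIN = -5    # assumed range, mirroring CPython's small-int cache
    CACHE_MAX = 256
    _cache = {}

    def __init__(self, value):
        self.value = value

    @classmethod
    def get(cls, value):
        # Reuse a preallocated object when possible instead of creating a new one.
        if cls.CACHE_MIN <= value <= cls.CACHE_MAX:
            return cls._cache[value]
        return cls(value)


# Preallocate the commonly used values once, up front.
PyInt._cache = {
    v: PyInt(v) for v in range(PyInt.CACHE_MIN, PyInt.CACHE_MAX + 1)
}

small_a = PyInt.get(7)
small_b = PyInt.get(7)
large_a = PyInt.get(100000)
large_b = PyInt.get(100000)
print(small_a is small_b)   # same preallocated object
print(large_a is large_b)   # two separate allocations
```

Every lookup inside the cached range avoids an object creation, which is where the saving comes from when small integers and booleans are used heavily.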

If I may be a bit more reflective here, one of the a-ha! moments I had this summer was realizing that to really optimize something, you have to understand where its biggest problems are first. I remember pitching to Russ at the start of the summer things like loop unrolling, constant folding, even converting to SSA-form (you know, stuff I heard about optimization in my compilers class) and he was saying to me, think simpler. While working on my project, I used a profiler to understand exactly which parts of VOC were slow, and that information drove the changes we implemented. I think it worked out pretty well!

Future Work

  • Minimize boxing of primitive types like String and Int. As VOC is written half in Python, half in Java, a single integer can be found in various representations on its way through the compiler -- as a Python object, unboxed to a primitive Java int, then packaged back up to a Python object. This problem was (somewhat incoherently) addressed in my proposal, but ultimately we couldn't come up with a good abstraction to support it.
  • Build a peephole optimizer. CPython's peephole optimizer scans generated bytecode to identify sequences of bytecode that can be optimized; VOC could benefit from this too.
  • Hook up more benchmarks, which serve as both proof of the kinds of programs VOC can currently compile and areas ripe for performance improvement.

Thank you

I will wrap this up by giving big thanks to Russ, my mentor. The time you spent helping me form my ideas, patiently answering my questions and reviewing my work was invaluable to me. It couldn't have been easy keeping up with what I was doing especially since I started improvising halfway through the summer. I am so grateful for your help, thank you.

2018 Google Summer of Code - Implement asyncio support in VOC

Posted by Yap Boon Peng on 11 August 2018

In the blink of an eye, Google Summer of Code (GSoC) 2018 has come to an end. During the three-month coding period, I contributed several patches to BeeWare's VOC repository, all working towards the ultimate goal of running the asyncio module in VOC. In this blog post (which is my first actual blog post, by the way 😄), I will document what I have done so far, why I couldn't reach that goal (yes, unfortunately I couldn't get asyncio to work by the end of GSoC 2018), and what still needs to be done to achieve it (or at least make part of asyncio work).

Building Foundation

The first error that the transpiler threw when attempting to compile the asyncio module was "No handler for YieldFrom", so it made sense to start with that issue.

Another generator-related feature was the yield expression. Before GSoC 2018, yield in VOC was only a statement, meaning it could not be used as an expression. Generator methods such as generator.send, generator.throw and generator.close were not supported either. Those features are what make asynchronous programming with generators possible, so I spent a few weeks extending generator functionality in VOC, paving the way to the asyncio module.
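For readers who haven't used these generator features, here is a small standard-Python example of the behaviour in question: yield used as an expression, the send, throw and close methods, and yield from delegation. The function names accumulator and delegating are made up for this illustration, which shows the language semantics rather than VOC's implementation.

```python
def accumulator():
    """Yield a running total; send() adds to it, throw(ValueError) resets it."""
    total = 0
    while True:
        try:
            # yield as an expression: it evaluates to whatever send() passes in.
            value = yield total
        except ValueError:
            total = 0
        else:
            total += value


def delegating():
    # yield from forwards send(), throw() and close() to the inner generator.
    yield from accumulator()


gen = delegating()
first = next(gen)                   # run to the first yield
after_five = gen.send(5)            # total becomes 5
after_ten = gen.send(10)            # total becomes 15
after_reset = gen.throw(ValueError) # handled inside accumulator, total reset
gen.close()                         # raises GeneratorExit at the paused yield

print(first, after_five, after_ten, after_reset)
```

This two-way communication between caller and generator is what generator-based coroutines are built on.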

PRs related to generator are listed below:

  • PR #821 : Added support for Yield from statement (merged)
  • PR #823 : Added generator send method (merged)
  • PR #831 : Support exceptions handling in generator (merged)

Nonlocal Statement

The nonlocal statement was another piece of syntax not supported by VOC. Once the generator features were complete, implementing it was the next step towards compiling the asyncio module.

Implementing this feature took about 3-4 weeks, as it is not as trivial as it seems. I tried several approaches; some of them worked, but the code was hacky and not pretty, and could have come back to bite me or other contributors in the long run. After many discussions with Russell, I refactored the closure mechanism in VOC and took a much cleaner approach to the nonlocal implementation. I must admit that I took some shortcuts for the sake of "making nonlocal work" while implementing the statement, resulting in poor design and messy code. Many thanks to Russell, who helped me improve my coding style and told me not to be discouraged when I was stuck. 😄
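As a reminder of what the nonlocal statement does in standard Python, the sketch below shows a closure that rebinds a variable from its enclosing function. The make_counter example is illustrative only and does not reflect how VOC implements closures internally.

```python
def make_counter():
    count = 0

    def increment():
        # Without nonlocal, `count += 1` would try to create a new local
        # variable and fail; nonlocal rebinds the enclosing function's variable.
        nonlocal count
        count += 1
        return count

    return increment


counter = make_counter()
first_call = counter()
second_call = counter()

# Each call to make_counter() gets its own independent closure state.
other_counter = make_counter()
other_first_call = other_counter()

print(first_call, second_call, other_first_call)
```

Supporting this requires the inner and outer functions to share one mutable binding, which is why the closure mechanism had to be reworked.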

Related PRs:

  • PR #854 : Nonlocal statement support (merged)
  • PR #873 : Added closure related test cases (merged)

The Collections Module

Next on my hit list was a pure Java implementation of the collections module. The asyncio module depends on three data structures from collections, namely defaultdict, deque and OrderedDict. Two of them (defaultdict and deque) are implemented in C in CPython, and they have good analogues in Java, so it makes sense to implement the module in Java. Porting defaultdict, deque and OrderedDict to Java in VOC was relatively straightforward, taking about 1.5 weeks to complete.
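For reference, this is the standard-library behaviour of the three data structures that a port needs to reproduce. The example uses CPython's collections module directly, and the sample data is arbitrary.

```python
from collections import OrderedDict, defaultdict, deque

# defaultdict: missing keys are created on first access using the factory.
words_by_length = defaultdict(list)
for word in ["bee", "ware", "voc", "toga"]:
    words_by_length[len(word)].append(word)

# deque: efficient appends and pops at both ends.
queue = deque([1, 2, 3])
queue.appendleft(0)
queue.append(4)
first_item = queue.popleft()

# OrderedDict: remembers insertion order and supports reordering.
ordered = OrderedDict()
ordered["b"] = 1
ordered["a"] = 2
ordered.move_to_end("b")

print(dict(words_by_length))
print(first_item, list(queue))
print(list(ordered))
```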

Related PRs:

  • PR #874 : Implement collections.defauldict (merged)
  • PR #896 : Implements collections.Deque (merged)
  • PR #897 : Implements collections.OrderedDict (merged)

Other PRs submitted during GSoC 2018

  • PR #817 : Added coroutine related exception class [WIP] (closed as not needed)
  • PR #836 : Changed Bool construction to use getBool instead (merged)
  • PR #847 : Add custom exceptions test cases (closed due to more comprehensive handling in PR #844)
  • PR #849 : Fixed Unknown constant type <class 'frozenset'> in function definition (merged)
  • PR #858 : Added test case for Issue #857 (merged)
  • PR #860 : Added test case for Issue #859 (merged)
  • PR #862 : Added test case for Issue #861 (merged)
  • PR #867 : Fixed Issue #866 RunTimeError when generator is nested in more than 1 level of function definition (merged)
  • PR #868 : Fixed Issue #861 Redefining nested function from other function overrides original nested function (merged)
  • PR #879 : Fixed Incompatible Stack Height caused by expression statement (merged)
  • PR #901 : Added test case for Issue #900 (merged)
  • PR #788 : Implements asyncio.coroutines [WIP] (open, the dream 😎)

Issues submitted during GSoC 2018

  • Issue #861 : Redefining nested function from other function overrides original nested function (fixed in PR #868)
  • Issue #866 : RunTimeError when generator is nested in more than 1 level of function definition (fixed in PR #867)
  • Issue #828 : Finally block of generator is not executed during garbage collection (open)
  • Issue #857 : Complex datatype in set cause java.lang.StackOverflowError (open)
  • Issue #859 : Duplicated values of equivalent but different data types in set (open)
  • Issue #900 : Exception in nested try-catch suite is 'leaked' to another enclosing try-catch suite (open)
  • Issue #827 : Maps reserved Java keywords to Python built-in function/method call (closed)

Towards The Ultimate End Goal

Unfortunately, the three-month GSoC coding period was not enough for me to bring the asyncio module to VOC. The nonlocal statement implementation was the biggest blocker, mainly because I didn't think things through before writing code. Had I planned carefully and laid out a general coding direction, I would have completed it in much less time and had time for other implementations. A piece of advice for aspiring and upcoming GSoC students: don't rush your code, and make sure you know 100% what you're doing before diving into the code.

With that said, the following modules need to be implemented in or ported to Java before asyncio will work in VOC:

  • socket module (a bit tricky since Java doesn't support Unix domain socket natively)
  • selectors module (high level I/O operations)
  • threading module (might be easier to implement this first, since Python's threading module is loosely based on Java's threading model)
  • time module (partially implemented in VOC)

Final Thoughts

Huge thanks to my mentor, Russell Keith-Magee, for accepting my proposal, providing guidance and encouraging me when things didn't go as intended. It is truly an honor to be a part of the BeeWare community. I had a blast contributing to the BeeWare project, and I'm sure I will stick around as a regular contributor. Also, a shout-out to the BeeWare community for answering my queries and reviewing my pull requests. 😄

Project Spotlight: Colosseum

Posted by Russell Keith-Magee on 6 October 2017

This article was originally published on the BeeWare Enthusiasts mailing list. If you'd like to receive regular updates about the BeeWare project, why not subscribe?

When you're designing a GUI app - be it for desktop, mobile, or browser - one of the most fundamental tasks is describing how to lay out widgets on the screen. Most widget toolkits will use a grid or box packing model of some kind to solve this problem. These models tend to be relatively easy to get started with, but rapidly fall apart when you have complex layout needs - or when you have layouts that need to adapt to different screen sizes.

Instead of inventing a new grid or box model, the Toga widget toolkit takes a different approach, using a well known scheme for laying out content: Cascading Style Sheets, or CSS. Although CSS is best known for specifying layout in web pages, there's nothing inherently web specific about it. At the end of the day, it's a system for describing the layout of a hierarchical collection of content nodes. However, to date, every implementation of CSS is bound to a browser, so the perception is that CSS is a browser-specific standard.

That's where Colosseum comes in. Colosseum is a browser-independent implementation of a CSS rendering engine. It takes a tree of content "nodes" - such as a DOM from an HTML document - and applies CSS styling instructions to lay out those nodes as boxes on the screen. In the case of Toga, instead of laying out <div> and <span> elements, you lay out Box and Button objects. This allows you to specify incredibly complex, adaptive layouts for Toga applications.

But Colosseum as a project has many other possible uses. It could be used anywhere that there is a need for describing layout outside a browser context. For example, Colosseum could be the cornerstone of an HTML-to-PDF renderer that doesn't require the involvement of a browser. It could also be used as a test harness and reference implementation for the CSS specification itself, providing a lightweight way to encode and test proposed changes to the specification.

The current implementation is based on Facebook's yoga project - it was originally a line-for-line port of yoga's JavaScript codebase into Python. However, yoga only implements the Flexbox portion of the CSS3 specification.

This week, we started a big project: rewriting Colosseum to be a fully standards-compliant CSS engine. The work so far can be found in the globe branch of the colosseum repository on GitHub. The first goal is CSS2.1 compliance, with an implementation of the traditional CSS box model and flow layout. Once we've got a reasonable implementation of that, we'll look to adding Grid and Flexbox layouts from the CSS3 specification set.

This is obviously a big job. CSS is a big specification, so there's a lot of work to be done - but that also means there's lots of places to contribute! Pick a paragraph of the CSS specification, build some test cases that demonstrate the cases described in that paragraph, and submit a patch implementing that behaviour!

It also highlights why your financial support is so important. While we could do this entirely with volunteered effort, we're going to make much faster progress if a small group of people can focus on this project full time. Financial support would allow us to significantly ramp up the development speed of Colosseum, and the rest of the BeeWare suite.

If you would like to see Colosseum and the rest of BeeWare develop to the point where it can be used for commercial applications, please consider supporting BeeWare financially. And if you have any leads for larger potential sources of funding, please get in touch.

2017 Google Summer of Code - Port Cricket to use Toga, instead of Tkinter

Posted by Dayanne Fernandes on 25 August 2017

After almost 4 months of work on Google Summer of Code 2017, I'm finally completing my proposal. Every widget migration and every commit, PR, issue and discussion with my mentors about Cricket, Toga and rubicon-objc was detailed in Issue 58.

"Eating your own dog food"

The best way to show customers that a product is reliable is to use it. So, the way to show that Toga is an effective tool for building a GUI is to build a complete application with it.

Cricket is a graphical tool that helps you run your test suites. Its current version is implemented using Tkinter as the main GUI framework. So, why not test Toga inside another BeeWare product? That's what I have accomplished during my GSoC work.

Results

The proposal focuses not only on the port from Tkinter to Toga, but also on mapping out the widgets needed for a real application using the Toga framework. To help me with this mapping, I studied Tkinter, Toga, Colosseum, rubicon-objc, Objective-C, Cocoa and CSS.

The work I did during GSoC was submitted through PR 65 and reported in Issue 58, and the final demonstration of the work can be seen in this link. Some widgets used in Cricket weren't ready yet in Toga, so some improvements to Toga were necessary before I could use them in Cricket. In summary, here are some PRs and issues that I contributed to get my work done in Cricket:

Open PR that I sent to Toga:

  • PR 201 : [Core][Cocoa] Refactoring of the Tree widget

Merged PRs that I sent to Toga:

  • PR 112 : [Core][Cocoa] Enable/disable state for buttons, solved Issue 91
  • PR 170 : [Cocoa] Content and retry status for stack trace dialog
  • PR 172 : [Cocoa] Window resize
  • PR 173 : [Core][Cocoa] Button color
  • PR 174 : [Doc] Examples folder and button features example
  • PR 178 : [Doc] Fix tutorial 2 setup
  • PR 180 : [Doc] Update Toga widgets roadmap
  • PR 182 : [Cocoa] Update the label of the Stack trace button for critical dialog
  • PR 184 : [Core][Cocoa] Hide/show boxes widget
  • PR 188 : [Cocoa] Fix error on MultilineTextInput widget, solved Issue 187
  • PR 204 : [Core][Cocoa] Clear method to MultilineTextInput widget, solved Issue 203
  • PR 206 : [Core][Cocoa] Readonly and placeholder for MultilineTextInput widget
  • PR 208 : [Cocoa] Fix apply style to a SplitContainer widget, solved Issue 207

Merged PR that I sent to Cricket:

Merged PR that I sent to rubicon-objc:

  • PR 34 : [Doc] Add reference to NSObject

Open issues that I sent to Toga:

  • Issue 175 : [Core] Add more properties for Label and Font widgets
  • Issue 176 : [Core] Add "rehint()" on the background of the widget after changing font size
  • Issue 186 : [Core] Set initial position of the divisor of a SplitContainer
  • Issue 197 : [Core] Get the id of the selected Tab View on the OptionContainer

Closed issues that I reported to Toga:

Closed issues that I didn't report but solved on Toga:

  • Issue 91 : API to disable buttons?
  • Issue 205 : adding MultiviewTextInput results in TypeError

Closed issue that I reported to Cricket:

  • Issue 59 : Run selected doesn't count/ runs every test selected in a test module, was fixed by me

Open issue that I reported to Jonas Obrist's rubicon-objc repository:

  • Issue 1 : Seg Fault when iterate through a NSIndexSet using block notation

Future Plans

There are some features in Cricket that I want to help develop in the near future, for example:

  • A button to refresh all the tests tree
  • Cricket settings

Also, some issues remain after this migration to Toga. These will be fixed in the Toga widgets in the near future too, for example:

  • A gap between the output and error boxes when there is no output message
  • Run a test if the user clicks on it

I truly believe that Toga will be the official Python framework for building GUIs for multiplatform applications, so I'll continue to contribute to this project because I want to use it in every application where I need a GUI.

Final Considerations

I would like to sincerely thank my mentors Russell Keith-Magee and Elias Dorneles for guiding and helping me so much during this period. The opportunity to be part of this community was a great honor for me; thank you so much, Russell Keith-Magee, for accepting me into this program. I also want to thank Philip James, who reviewed some of my PRs, and Jonas Schell, who fixed one issue that I reported to Toga.

2017 Google Summer of Code - Batavia improvements

Posted by Adam Boniecki on 23 August 2017

With the Google Summer of Code 2017 program nearing its end, it is time to summarize what I got done during the summer working on Batavia.

Batavia is a part of BeeWare's collection of projects. As it is still in an early stage of development, I offered to implement a number of features missing from Batavia, ranging from elemental data types, through JSON manipulation, to language constructs such as generators. I posted my proposal in this GitHub thread and kept it updated with my progress on a weekly basis.

Note that by the end of GSoC, we had decided to diverge from the proposal and forgo the implementation of contextlib in favor of support for Python 3.6's 2-byte wide opcodes.
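To make the opcode change concrete: from Python 3.6 onwards, CPython bytecode is a sequence of fixed 2-byte units, one byte for the opcode and one for its argument. The sketch below inspects that layout using the standard dis module on CPython 3.6 or later. It only illustrates the format an interpreter has to read and does not show Batavia's own code; the add function is an arbitrary example, and the exact opcodes printed vary between Python versions.

```python
import dis


def add(a, b):
    return a + b


code = add.__code__.co_code

# Every instruction occupies exactly two bytes: (opcode, argument).
for offset in range(0, len(code), 2):
    opcode, arg = code[offset], code[offset + 1]
    print(offset, dis.opname[opcode], arg)

# The instruction offsets reported by dis are therefore always even.
offsets = [instruction.offset for instruction in dis.get_instructions(add)]
print(offsets)
```

Before 3.6, instructions were one byte when they had no argument and three bytes otherwise, so a bytecode interpreter has to decode the two layouts differently.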

Overall it was a great learning experience and fun. Big thanks to my mentors Russell Keith-Magee and Katie McLaughlin, and the whole BeeWare community.

2017 Google Summer of Code - Testing Toga/Settings API

Posted by Jonas Schell on 22 August 2017

Google Summer of Code 2017 is coming to an end. After three months of working on the BeeWare project, I would like to summarise my work and share my experiences.

“No Battle Plan Survives First Contact.”

This was one of the first things Russell said to me after we decided to fundamentally restructure my proposed GSoC timetable and goals. During the community bonding period we discovered that Toga was even harder to test than we expected. The tight coupling between the platform-independent Toga-Core and the platform-dependent implementations (for Windows, macOS, iOS, Linux, Android, Django, ...) was making it hard to write meaningful tests.

We expect Toga to become a decently sized project, so we want it to have a solid and well-tested base. Given that, we decided that I would spend most of GSoC restructuring Toga to make it more testable. Besides that, I would also add implementation tests to check whether a given backend is implemented correctly. If I finished before the end of the summer, I would start on my original project proposal.

The Big Restructure of Toga

With the new goals and a fresh branch I started the journey to restructure the Toga project to make it more testable.

After hacking around and testing different things on a separate branch, I identified the intertwined platform dependencies as the main problem. To separate the Toga-Core module from its backend implementations, we decided to use the factory pattern instead of the inheritance model that we had before. Now every backend has its own factory that produces the right widgets for the platform it is running on. This way we have a clear separation between Toga-Core and the implementation level. Platform dependencies are now enclosed in the implementation level.
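The shape of that change can be sketched as follows. This is a simplified illustration of the factory pattern, not Toga's actual code: the class names (CocoaButton, DummyButton, CocoaFactory, DummyFactory), the describe method and the way the factory is passed in are all assumptions made for the example.

```python
class CocoaButton:
    """Hypothetical macOS implementation; a real one would wrap a native button."""

    def __init__(self, interface):
        self.interface = interface

    def describe(self):
        return "cocoa button: " + self.interface.label


class DummyButton:
    """Hypothetical test double that records calls instead of touching a GUI."""

    def __init__(self, interface):
        self.interface = interface
        self.calls = ["create"]

    def describe(self):
        self.calls.append("describe")
        return "dummy button: " + self.interface.label


class CocoaFactory:
    Button = CocoaButton


class DummyFactory:
    Button = DummyButton


class Button:
    """Platform-independent widget: it never imports a backend directly."""

    def __init__(self, label, factory):
        self.label = label
        # The factory decides which platform implementation backs this widget.
        self._impl = factory.Button(interface=self)

    def describe(self):
        return self._impl.describe()


test_button = Button("Run", factory=DummyFactory)
print(test_button.describe())
print(test_button._impl.calls)
```

Because the core widget only talks to whatever the factory hands it, a test can supply a dummy factory and check the core logic without any native GUI libraries being installed.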

After the new structure was clear, I ported Toga-Core as well as the backends for Cocoa, iOS and GTK. I did this in the Toga branch (The Big Restructure of Toga [WIP] #185). In practice this meant that I had to manually touch almost every widget of all backends to port them to the new factory pattern.

Challenges

Toga talks to native GUI frameworks, so I had to get a good understanding of the principles and concepts behind each of these frameworks. At times I felt overwhelmed by the combined complexity of all the parts that make up Toga. The following is a list of the challenges:

  • Every Toga backend wraps around an existing and unique framework. To wrap the framework you have to understand the framework.
  • “I love Python, why do I have to understand Objective C”? To effectively work on the iOS and macOS backends I had to learn the basics of Objective C – if only to read the Apple docs.
  • Toga has a lot of moving parts. There are backends, frameworks, libraries to talk to backends, libraries to perform the layout of the UI and more. I spent a good amount of time understanding all of these parts. The following is just an overview of the knowledge I acquired during GSoC:
    • Rubicon-ObjC to talk to the iOS and macOS backends.
    • Colosseum to understand and fix layout problems.
    • AST module to perform the implementation tests.
    • The use of Design Patterns
    • How to structure large projects.
    • Read and understand big and complex code chunks.
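As an illustration of how the AST module can support implementation tests, the sketch below parses a backend's source code and reports which required methods a widget class is missing, without importing the backend (which would need the native platform libraries). The helper names, the sample source and the list of required methods are hypothetical and are not taken from Toga's test suite.

```python
import ast

# Hypothetical backend source; in practice this would be read from a file.
BACKEND_SOURCE = '''
class Button:
    def create(self):
        pass

    def set_label(self, label):
        pass
'''


def defined_methods(source, class_name):
    """Return the names of the methods defined directly on class_name."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return {
                item.name
                for item in node.body
                if isinstance(item, ast.FunctionDef)
            }
    return set()


def missing_methods(source, class_name, required):
    """Return the required method names that class_name does not define."""
    return sorted(set(required) - defined_methods(source, class_name))


missing = missing_methods(
    BACKEND_SOURCE, "Button", ["create", "set_label", "set_enabled"]
)
print(missing)
```

A check like this can run on any platform, since it only reads the source text of each backend.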

Other work I did

Future Work to be Done

  • All my work sits in the PR “The Big Restructure of Toga [WIP]”. After the missing backends, namely Windows and Android, are ported, everything can be merged into master. We have to wait for the missing backends, because the new structure is incompatible with the old versions and they can’t coexist.
  • The Settings API from my original proposal is not finished because of the above mentioned reasons. I have a first working draft and I will continue working on it after GSoC in this PR.

Shoutout

I would like to thank Russell Keith-Magee for being an awesome mentor and for all the time he invested in me during GSoC. I would also like to thank the BeeWare community for helping me whenever I had a problem. Thank you!

A new yak for the herd: BeeKeeper

Posted by Russell Keith-Magee on 31 July 2017

Testing is a skill that is a vital part of every programmer's training. Learning how to write good tests helps you write more robust code, and ensures that when you've written code that works, it keeps working long into the future. It can also help you write better code in the first place. It turns out that well architected code, with high cohesion and low coupling, is also easy to test - so writing code that is easy to test will almost always result in better overall code quality.

An important step in "levelling up" your testing experience is to start using a Continuous Integration, or CI service. A CI service is a tool that automatically runs your test suite every time someone makes a change - or every time someone proposes a change in the form of a pull request. Using a CI service makes sure that your code always passes your test suite - you can't accidentally slip in a change that breaks a test, because you'll get a big red warning notification.

CI is such an important service that many companies exist solely to provide CI-as-a-service. The BeeWare project has, for various projects, used TravisCI and CircleCI. Both these tools provide free tiers for open source projects, and have generously sponsored BeeWare with capacity upgrades at various times.

However, the BeeWare project has had an interesting relationship with commercial CI services. This is for two reasons.

Firstly, our test suites - especially for VOC and Batavia - take a long time to run. These two projects require tests that repeatedly start and shut down virtual machines (for Java and JavaScript, respectively), and no matter how much you optimize the code being tested, the startup and shutdown time of a virtual machine eventually adds up. We also need to run our test suites on multiple versions of Python - at present, we support Python 3.4, 3.5 and 3.6, with 3.7 on the horizon. There are also subtle changes in micro versions that may require testing.

We've been able to speed up the duration of a test run by splitting up the test suite and running parts of the suite in parallel, but that forces us up against the second problem. Commercial CI services usually operate on a subscription model; higher subscriptions providing more simultaneous resources. However, our usage pattern is highly unusual. Most of the year, we get a slow trickle of pull requests that require testing. However, a couple of times a year, we have a large sprint, and we have a flood of contributions over a short period of time. At PyCon US, we have had groups of 40 people submitting patches - and they all need their submissions tested by CI. And time is a factor - the sprints only last a couple of days, so a fast turnaround on testing is essential.

If we were to subscribe to the top tier subscription levels of CircleCI and TravisCI, we still wouldn't have enough resources to support a sprint - but we'd be massively overresourced for the rest of the year. We'd also have to pay $750 or more a month for this service, which is a budget we can't afford.

So - we had a problem. To run our test suite quickly, we needed massively parallelized resources; and at certain times of the year, we needed extremely large numbers of these resources.

We also had other automated tasks that we wanted to perform. We wanted to do code linting (automated checking of code style) before a PR was tested. We wanted to check spelling of documentation. And we wanted these tasks to feed back into GitHub as automated comments and specific code review status markers, informing contributors not just that a problem occurred - but what problem occurred, and where in their code.

We also wanted to manage pipeline builds - there's no point in doing a full test of multiple versions of Python once you've established the tests are failing on one version. And there's no point testing at all if there are code style problems.
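As a rough illustration of that fail-fast idea (a sketch only, not BeeKeeper's actual code), a pipeline can be modelled as an ordered list of phases, where a later phase only runs if every task in the previous phase passed. The task names and the `run_pipeline` helper below are hypothetical.

```python
from typing import Callable, Dict, List

# A task is any callable that returns True on success.
Task = Callable[[], bool]


def run_pipeline(phases: List[Dict[str, Task]]) -> List[str]:
    """Run phases in order; skip all later phases once any task fails.

    Returns the names of the tasks that were actually executed.
    """
    executed = []
    for phase in phases:
        results = {}
        for name, task in phase.items():
            executed.append(name)
            results[name] = task()
        if not all(results.values()):
            break  # Later phases are skipped entirely.
    return executed


# Hypothetical phases: lint first, then one Python version, then the rest.
phases = [
    {"lint": lambda: True},
    {"test-py3.6": lambda: False},  # Simulate a failing smoke test.
    {"test-py3.4": lambda: True, "test-py3.5": lambda: True},
]

print(run_pipeline(phases))
```

Here the remaining Python versions are never tested, because the first test phase already failed.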

We also wanted to do things that weren't just testing. We wanted to check that contributor agreements have been signed. We wanted to automate deployment of websites and documentation.

What we had wasn't just a CI problem. It was a problem where we wanted to automatically run arbitrary code, in a safe way, in response to a GitHub event.

I've been trying to find a CI service that can meet our needs for over a year. But over the last year, a few thoughts started to congeal in my head.

  • Amazon provides an API (EC2) that allows you to spool up machines of varying complexity (up to 64 CPUs, with almost 500GB of RAM), and pay by the minute for that usage.
  • Docker provides the tools for configuring, launching, and running code in an isolated fashion.
  • Amazon also provides an API (ECS) to control the execution of Docker containers.

There's nothing specific about AWS EC2 or ECS either - you could just as easily use Linode and Kubernetes, or Docker Swarm, or Microsoft Azure... you just need to have the ability to easily spool up machines and run a Docker container. After all: a test suite is just a Docker container that runs a script that starts your test suite. A linting check is a Docker container that runs a script that lints your code. A contributor agreement check is a Docker container that checks the metadata associated with a pull request.

All you need then is a website that can receive GitHub event notifications, and start Docker containers in response.
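The core of such a website can be sketched in a few lines. The example below is an assumption-laden illustration, not BeeKeeper's implementation: it checks the HMAC signature that GitHub attaches to webhook deliveries (the `X-Hub-Signature-256` header), reads the event type from the `X-GitHub-Event` header, and looks up which container tasks to start. The `TASKS_FOR_EVENT` table and both function names are hypothetical, and actually launching the containers is left out.

```python
import hashlib
import hmac
from typing import Dict, List

# Hypothetical mapping from GitHub event type to the container tasks to start.
TASKS_FOR_EVENT: Dict[str, List[str]] = {
    "pull_request": ["lint", "test"],
    "push": ["test", "deploy-docs"],
}


def signature_is_valid(secret: bytes, body: bytes, header: str) -> bool:
    """Check the X-Hub-Signature-256 header against the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Compare as bytes, in constant time, to avoid leaking timing information.
    return hmac.compare_digest(expected.encode(), header.encode())


def tasks_for_request(secret: bytes, headers: Dict[str, str], body: bytes) -> List[str]:
    """Return the tasks to launch, or an empty list if the request is rejected."""
    if not signature_is_valid(secret, body, headers.get("X-Hub-Signature-256", "")):
        return []
    return TASKS_FOR_EVENT.get(headers.get("X-GitHub-Event", ""), [])
```

Rejecting unsigned or wrongly signed requests matters here, because the whole point of the service is to run code in response to an event.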

At the start of July, I found myself between jobs, and uttered the fateful question "How hard could it be?" And so, today, I'm announcing BeeKeeper - BeeWare's own CI service.

BeeKeeper deploys as a Heroku website, written using Django. After configuring it with GitHub and AWS credentials, it listens for GitHub webhooks. When a Pull Request or Push is detected, BeeKeeper creates a build request; that build request inspects the code in the repository looking for a beekeeper.yml configuration file. That configuration file describes the pipeline of tasks that will be performed, and for each task, the type of machine that should be used, any environment variables that are required, and the Docker image that will be used.

BeeKeeper also allows the site admin to describe what resources will be used to satisfy builds. A task can say it needs a "High CPU" instance; but the BeeKeeper instance can determine what "High CPU" means - is it 4 CPUs or 32? When those machines are spooled up, how long will they be allowed to sit idle before being shut down again? How many machines should be sitting in the pool permanently? And what is the upper limit on machines that will be started?
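Those questions amount to a small scaling policy. The sketch below shows one way such a policy could be expressed; the `PoolPolicy` fields and both functions are hypothetical names chosen for illustration, and are not taken from BeeKeeper's code.

```python
from dataclasses import dataclass


@dataclass
class PoolPolicy:
    """Hypothetical admin settings for one class of machine."""
    min_machines: int     # Always keep this many running.
    max_machines: int     # Never run more than this many.
    idle_timeout: float   # Seconds a machine may sit idle before shutdown.


def machines_to_start(policy: PoolPolicy, running: int, queued_tasks: int, idle: int) -> int:
    """How many new machines to start so queued tasks can run, within limits."""
    needed = max(queued_tasks - idle, 0)  # Idle machines absorb tasks first.
    target = max(running + needed, policy.min_machines)
    target = min(target, policy.max_machines)
    return max(target - running, 0)


def should_shut_down(policy: PoolPolicy, running: int, idle_seconds: float) -> bool:
    """Stop an idle machine only after its timeout, and never below the floor."""
    return idle_seconds >= policy.idle_timeout and running > policy.min_machines


policy = PoolPolicy(min_machines=1, max_machines=10, idle_timeout=300)
print(machines_to_start(policy, running=2, queued_tasks=5, idle=1))
print(should_shut_down(policy, running=3, idle_seconds=1000))
```

With a policy like this, a quiet month costs only the permanent pool, while a sprint can grow the pool up to the configured ceiling.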

A companion tool to BeeKeeper is Waggle. Waggle is a tool that prepares a local definition of a task so it can be used by BeeKeeper - it compiles the Docker image, and uploads it into ECS so that it can be referenced by tasks. (It's called "Waggle" because when worker bees discover a good source of nectar, they return to the hive and do a waggle dance that tells other bees how to find that source).

We've also provided a repository called Comb (named after honeycomb, the place bees store all the nectar they find) that defines the task configurations that a BeeKeeper instance can use. We've provided some simple definitions as part of the base Comb repository; each BeeKeeper deployment should have one of these repositories of its own.

There's still a lot of work to do, but we're already using BeeKeeper to test Batavia and VOC, and the upcoming PyCon AU sprints will be our first outing under high-load conditions. Some back-of-envelope calculations predict that for around $50, we'll be able to provide enough CPU resources for each test run to complete in 10 minutes or less, supporting a sprint of dozens of people.

Although BeeKeeper was written to meet the needs of the BeeWare project, it's an open source tool available for anyone to use. If you'd like to take BeeKeeper for a spin, come along to the sprints, or check out the code on GitHub.

BeeKeeper is also an example of the sort of product you'd see more of if BeeWare development were funded full time. I was able to build BeeKeeper because I had a spare couple of weeks between jobs. There is no end to the tools and libraries like BeeKeeper and Waggle that could be built to support the software development process - all that is missing is the resources needed to develop those tools. If you'd like to see more tools like BeeKeeper in the world, please consider joining the BeeWare project as a financial member. Every little bit helps, and if we can reach a critical mass of supporters, I'll be able to start working on BeeWare full time.