Title: EuroLLVM Conference 2015
Author: Martin Moene
Date: Thu, 09 July 2015 21:44:49 +01:00
Summary: Ralph McArdell reports on his experience of the LLVM Conference.
Body:
A week or so before the 2015 ACCU conference, I attended the 2015 EuroLLVM conference. LLVM [1] and associated projects such as Clang [2] are all about computer language translation infrastructure, with LLVM itself and many of the associated projects being open source. The conference was held over two days in the Hall Building of Goldsmiths College in New Cross, London, UK on Monday and Tuesday the 13th and 14th of April 2015. This was my first time at an LLVM conference and I attended because I am interested in LLVM and Clang and wanted to know more. The conference was reasonably priced at £60+VAT for the two days, and it was held conveniently close to me. As someone wanting to know more about LLVM, Clang, et al., my main interests were in overview and tutorial sessions rather than the hard technical sessions.
Monday
Conference registration started at 09:00, with refreshments provided from 09:30. The morning was taken up by what was called a ‘Hackers Lab’ – which presumably was for hard-core LLVM, Clang and related projects’ developers. Not being such a developer, once I had registered, attached my name badge and grabbed a coffee and a pastry snack, I joined many other delegates in hanging around in the common area, where a number of posters detailing LLVM-related endeavours by various organisations had been put up. I started reading one titled ‘LLVM for Deeply Embedded Systems’ by people from Embecosm [3] and Myre Laboratories [4]. I knew the name Embecosm and recognised one of the authors – Jeremy Bennett – from the Open Source Hardware User Group (OSHUG) [5] and the Parallella [6] SDK forums. I made a comment to a guy standing next to me and we got chatting – he was from Germany and was interested in the static analysis tools Clang provides, as he had an unfamiliar large ball-of-mud code base to maintain.
After lunch the conference proper started with a keynote given by Francesco Zappa Nardelli on the trickiness of concurrency in C and C++ even post C11 and C++11. Having revised the memory model, atomic operations and memory orderings that came in with C11 and C++11, we were reminded that certain sorts of compiler optimisations can produce incorrect code in concurrent contexts and told that these sorts of compiler bugs cannot be caught with the current state of compiler testing. Francesco then went on to assert that the problem can be reduced to searching for transforms of sequential code that are not sound for concurrent code and checking for changes to runtime events. A tool has been produced – CppMem [7] (I think) – that can check for these sorts of problems. The talk closed with the take-away that the formalisation of the C and C++ memory model has enabled compiler concurrency testing, and correctness of memory order mapping. However, there is a need to find out what compilers implement and programmers rely on – and please would we take the survey (I did not remember to).
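For readers who have not looked at this area for a while, here is a minimal message-passing sketch of the kind of ordering question the keynote dealt with – my own illustration, not code from the talk:

    // My own illustration, not code from the keynote. With release/acquire
    // ordering the assert below cannot fire; weaken both operations to
    // memory_order_relaxed and the compiler or hardware may legally reorder
    // things so that the reader sees ready == true while data is still 0.
    #include <atomic>
    #include <cassert>
    #include <thread>

    std::atomic<bool> ready{false};
    int data = 0;

    void writer() {
        data = 42;                                     // plain, non-atomic store
        ready.store(true, std::memory_order_release);  // publishes data
    }

    void reader() {
        while (!ready.load(std::memory_order_acquire)) {}  // wait for the flag
        assert(data == 42);                                 // guaranteed to hold
    }

    int main() {
        std::thread t1(writer), t2(reader);
        t1.join();
        t2.join();
    }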
Following right on from the keynote were the first of the sessions, with three parallel streams. I ran upstairs to the room where Eric Christopher and David Blaikie were giving a debug info tutorial. There I discovered that DWARF [8] is the primary debug information format used by Clang – the C language family front end that uses LLVM as a compilation back end – and that, as it is a permissive standard – meaning there are many variants – applications that consume DWARF information, such as debuggers, are not generalised but tend to be tied to the tool that generated the DWARF information. The main point of the tutorial was to introduce us to the LLVM DIBuilder class, which eases the pain of adding debug information to a program’s compilation output, along with useful hints such as building source location information into the design from the get go, as it is difficult to retrofit.
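By way of illustration – this is my own rough sketch rather than code from the tutorial, and the exact DIBuilder signatures vary between LLVM releases – attaching debug metadata to a module built through the C++ API looks something like this:

    // My own sketch, not the tutorial's code; DIBuilder signatures and header
    // locations differ between LLVM releases, so treat this as an outline only.
    #include "llvm/BinaryFormat/Dwarf.h"
    #include "llvm/IR/DIBuilder.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/Module.h"

    int main() {
        llvm::LLVMContext Ctx;
        llvm::Module M("demo", Ctx);

        llvm::DIBuilder DIB(M);
        llvm::DIFile *File = DIB.createFile("demo.c", "/home/me/src");
        // Describe the compilation unit that the debug information belongs to.
        DIB.createCompileUnit(llvm::dwarf::DW_LANG_C99, File, "my-frontend",
                              /*isOptimized=*/false, /*Flags=*/"",
                              /*RuntimeVersion=*/0);
        // ... createFunction(), createAutoVariable() and friends describe each
        // entity, and a DILocation on each instruction records its source
        // position - which is why it pays to carry locations from the start ...
        DIB.finalize();  // must be called once all debug metadata is in place
    }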
None of the sessions following the mid-afternoon refreshment break seemed to be introductory or tutorial in nature, so I went to a talk given by Mattias Holm on T-EMU 2 [9] – billed as the next generation of LLVM-based microprocessor emulator. T-EMU 2 uses C++11 and, unsurprisingly, the LLVM toolchain throughout. Currently T-EMU 2 only supports SPARC processors and, like many tools and projects based on LLVM, is library based and provides a command line interface. While the interpreted instruction implementation only yields around 10 MIPS of performance, this can be raised to around 90 MIPS by using a threaded and optimised approach. It is hoped to raise performance to an estimated 300 MIPS by moving to binary translation.
The main points we were supposed to take on board were that using the LLVM TableGen tool to emulate cores, coupled with the use of LLVM intermediate representation (IR), led to rapid emulator development. As an LLVM neophyte I also took away the notion that TableGen – which I had seen used during LLVM builds – seemed like something worth looking into further [10]. Mattias ended by noting that TableGen is not fully documented, causing people to resort to reading the code, and that LLVM IR assembler is hard to debug.
For the final session of the day I chose Zoltan Porkolab’s talk on Templight [11] (in fact Templight 2) – a Clang extension for debugging and profiling C++ template metaprograms – which sounded pretty much my sort of talk! The Templight developers have patched Clang to add options for Templight. Compiling C++ code with the Templight options active causes a trace file in XML format to be produced that can be used as input to front end analysis tools. The current tools have been developed using Graphviz and Qt and allow template instantiations to be displayed and analysed in a step-by-step fashion, or instantiation timings and memory usage to be analysed. The Metashell project [12] uses Templight to provide an interactive template metaprogramming REPL, and there is a Templight rewrite by Mikael Persson available on GitHub [13].
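To give a flavour of what gets traced – this is my own toy example, not one from the talk – an instantiation profiler such as Templight would show the recursive chain behind an innocent-looking compile-time expression:

    // My own toy metaprogram, not an example from the talk. A tool such as
    // Templight would record the instantiation chain Fact<5> -> Fact<4> ->
    // ... -> Fact<0>, with per-instantiation timings and memory usage.
    template <unsigned N>
    struct Fact {
        static const unsigned long long value = N * Fact<N - 1>::value;
    };

    template <>
    struct Fact<0> {
        static const unsigned long long value = 1;
    };

    int main() {
        static_assert(Fact<5>::value == 120, "compile-time factorial");
    }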
In the evening there were drinks and dinner at the London Bridge Hilton hotel. As there were a couple of hours to spend before the drinks I adjourned to a local pub with a couple of people I had met for a chat and a drink. At dinner I sat next to Andrew Ayers from Microsoft’s .NET team – who was giving a talk the following day on CoreCLR garbage collection support in the LLVM MSIL compiler – a talk I would have liked to go to if it did not clash with another talk I wanted to catch. I remember chatting a bit about C++ templates and C# / .NET generics.
Tuesday
The second and final day of the conference got off to a start at 09:00 with a keynote given by Ivan Godard from Mill Computing [14] on using the Clang and LLVM toolchain for their ‘truly alien’ Mill CPU architecture and the problems they have encountered. Ivan started by asking how many people had heard of the Mill CPU and, on finding most people had not, spent the first part of the talk on a quick tour of the Mill CPU architecture (for those interested, check out the documentation section on the Mill Computing web site [15]). Next Ivan went through how their compiler team were using LLVM and Clang and the problems they had encountered – including LLVM not liking the Mill’s large, very regular instruction set; LLVM and Clang losing ‘pointerhood’, as pointers tend to devolve to integers, which is not good for the Mill as it uses a specific 64-bit pointer type; LLVM not coping with the high level of function call support the Mill provides; and refactoring of LLVM code on the trunk breaking other targets. The Mill macro assembler is interesting in that assembler instructions are C++ functions and C++ is the assembler macro language: first you compile the assembler C++ program, and then run the resultant executable, which generates the assembler code. To end, Ivan offered some code they use to automatically produce specific Mill family member instruction sets from specifications, as an example of an alternative to the LLVM TableGen tool – which, it appears, is in need of having something done about it, but no one knows what. Finally, Ivan appealed to the LLVM and Clang community for help fixing the problems Mill Computing had experienced.
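As I understood the scheme – and this is my own hypothetical sketch of the idea, not Mill Computing’s code – it works roughly like this:

    // Hypothetical sketch of the idea as I understood it, not Mill Computing's
    // code. Each 'instruction' is an ordinary C++ function that writes the
    // corresponding assembler text, so C++ itself is the macro language:
    // compile this program, run it, and its output is the assembler source.
    #include <cstdio>

    void load (const char *dst, unsigned addr)                { std::printf("  load  %s, [0x%x]\n", dst, addr); }
    void add  (const char *dst, const char *a, const char *b) { std::printf("  add   %s, %s, %s\n", dst, a, b); }
    void store(unsigned addr, const char *src)                { std::printf("  store [0x%x], %s\n", addr, src); }

    // An ordinary C++ function acts as a macro that emits an instruction sequence.
    void sum_pair(unsigned base) {
        load ("r0", base);
        load ("r1", base + 4);
        add  ("r2", "r0", "r1");
        store(base + 8, "r2");
    }

    int main() {
        std::printf("sum_pair:\n");
        sum_pair(0x100);   // running the program prints the assembler text
    }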
The first of Tuesday’s sessions I attended was given by Liam Fitzpatrick and Marco Roodzant about LLVM-TURBO [16], which turned out to be a commercial product aimed at those needing to create code generators for their embedded processors who do not wish to get their hands dirty with Clang and LLVM directly. The selling point is that LLVM-TURBO requires less time and fewer people: the example given was that using vanilla LLVM required 10 people over 2 years while using LLVM-TURBO required 3 people over 4 months. LLVM-TURBO uses what appears to be their own CoSy compiler development system and bridges between LLVM and CoSy formats.
To take us up to lunch there was a set of short 5-minute lightning talks – a familiar concept to those who have attended an ACCU conference in recent years. Arnaud de Grandmaison started the proceedings by informing us that using vectorisation to speed up computations such as colour space conversion and matrix multiplication can give a two-times speed increase. Dmitry Borisenkov reported on an LLVM-based ahead-of-time JavaScript compiler that, although in an alpha state, can be up to two times faster than the Google V8 engine. Tilmann Scheller gave us some tips on building Clang and LLVM as quickly as possible and also spoke about the new 2.0 version of the OpenCL SPIR intermediate representation [17], saying that unlike the original it is no longer a subset of LLVM IR but can easily be mapped to LLVM using a small decoder. Next, Frej Drejhammar and Lars Rasmusson presented their proposal for LLVM extensions allowing the generation of patch points, while Jiangning Liu, Pablo Barrio and Kevin Qin explained their patch that uses heuristics to improve the LLVM inliner’s performance. Alberto Magni introduced Symengine, which analyses CPU↔GPU transactions in order to optimise data transfers between CPU and GPU. Edward Jones explained how a patch to DejaGnu [18], used for regression testing of GCC, allows it to be used to regression test Clang, which is especially useful for embedded systems as they can use the remote execution feature. Hao Liu, James Molloy and Jiangning Liu presented a method of vectorising interleaved memory accesses. Pablo Barrio, Chandler Carruth and James Molloy meanwhile returned to inlining with a description of their attempts to allow inlining of recursive functions, and how the final attempt, using a stack to remove recursion, in fact ended up producing slower code for a pathological Fibonacci series test case. Russell Gallop presented a method of verifying that code generation is unaffected by compiler options such as -g (generate debug information) and -S (preprocess and compile but do not assemble or link) by compiling with and without the option(s) of interest and comparing the generated output – doing so can help locate subtle bugs across the compiler code base. Kevin Funk explained how moving the KDevelop IDE’s C and C++ editor language support to libclang provided full C and C++ language parsing – and they got Objective-C parsing for free! Finally, Alexander Richardson and David Chisnall introduced a Clang extension that optimises memory allocation for objects using the C++ PImpl idiom [19] by combining allocations in a similar way to std::make_shared. If I understood the extension correctly, through the use of a custom attribute it would also create the wrapper class that wraps the implementation class instance pointer.
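For readers unfamiliar with the idiom, a plain PImpl class normally costs an extra heap allocation per object – the overhead the extension removes by fusing that allocation with the owning object’s, much as std::make_shared fuses the control block with the managed object. A rough sketch of the unoptimised starting point (my own example, not the presenters’):

    // Plain PImpl - my own example, not the presenters' code. The Impl lives in
    // a separate heap allocation, which is the per-object overhead the proposed
    // Clang extension was said to remove by combining the two allocations.
    #include <memory>

    // --- widget.h ---
    class Widget {
    public:
        Widget();
        ~Widget();
        void draw() const;
    private:
        struct Impl;                   // defined only in the .cpp file
        std::unique_ptr<Impl> pimpl_;  // one extra allocation per Widget
    };

    // --- widget.cpp ---
    struct Widget::Impl {
        int colour = 0;                // private data hidden from clients
    };

    Widget::Widget() : pimpl_(new Impl) {}  // or std::make_unique in C++14
    Widget::~Widget() = default;            // Impl is complete here, so this is fine
    void Widget::draw() const { (void)pimpl_->colour; /* ... */ }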
After lunch I decided to go to Deepak Panickal and Ewan Crawford’s session on why one might want to use LLDB [20], the LLVM debugger. It is designed with a clean and maintainable plugin architecture; works on all major platforms, with the caveat that there is more work to do on MS Windows support; maintains up-to-date language support by using libclang; has both C++ and Python APIs for adding LLDB support to applications and automating repetitive tasks; has an internal Python interpreter, allowing scripts to be run from breakpoints; has a GDB-compatible machine interface; and should be easy to switch to from GDB. Got that? Good.
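By way of a small illustration – my own minimal sketch against the C++ ‘SB’ API rather than anything shown in the session – driving LLDB from a program looks something like this:

    // My own minimal sketch of driving LLDB through its C++ SB API, not code
    // from the session. Link against liblldb; header layout can vary between
    // LLDB releases.
    #include <lldb/API/SBBreakpoint.h>
    #include <lldb/API/SBDebugger.h>
    #include <lldb/API/SBProcess.h>
    #include <lldb/API/SBTarget.h>

    int main() {
        lldb::SBDebugger::Initialize();
        lldb::SBDebugger dbg = lldb::SBDebugger::Create();

        lldb::SBTarget target = dbg.CreateTarget("./a.out");            // the debuggee
        lldb::SBBreakpoint bp = target.BreakpointCreateByName("main");  // stop in main
        (void)bp;

        // Launch with default arguments and environment in the current directory.
        lldb::SBProcess process = target.LaunchSimple(nullptr, nullptr, ".");
        (void)process;
        // ... threads, frames and variables are then available via the SB* classes ...

        lldb::SBDebugger::Destroy(dbg);
        lldb::SBDebugger::Terminate();
    }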
Following the LLDB session I went straight into Daniel Krupp, Gyorgy Orban, Gabor Horvath and Bence Babati’s talk on their industrial experiences of using the Clang static analysis tooling. It seems that while Clang and its associated tools can form an impressive checker framework, they do have usability problems which make their use quite fiddly. To mitigate these problems the authors’ team at Ericsson created a project build workflow together with viewer tools to smooth over the rough edges. There are plans to open source the tools and submit the code to the community, provided their employer has no objections.
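To give an idea of the class of defect involved – my own toy example, not one from the talk – the analyzer performs path-sensitive checks of this sort:

    // My own toy example, not one from the talk. The Clang static analyzer
    // reports the path on which 'reset' is true: the buffer is freed and then
    // written to, and is then freed a second time.
    #include <cstdlib>

    int update(bool reset) {
        int *buf = static_cast<int *>(std::malloc(sizeof(int)));
        if (buf == nullptr) return -1;
        if (reset) {
            std::free(buf);  // freed on this path only
        }
        *buf = 42;           // use after free on the 'reset' path
        std::free(buf);      // double free on the same path
        return 0;
    }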
After the mid-afternoon refreshment break I went to JF Bastien’s talk on using C++ on the web without getting users pwned. It seems JF works for Google, and the talk concerned the Chrome Native Client (NaCl) [21] and Portable Native Client (PNaCl) and the security measures they use. As I had never seen (P)NaCl in action, the most impressive thing about this talk was the demonstrations of things like bash shells running in Chrome along with applications like Emacs and Vim. It seems that the native client built in to recent Chrome browsers, and I think Chromebooks, allows native code – currently written in C and/or C++ – to execute in a sandboxed Linux-like OS environment. PNaCl executables are compiled down to LLVM IR using the LLVM toolchain and are then compiled to native code when downloaded as part of a web page load. Of course you have to be paranoid about running native code, so in addition to the sandboxed pseudo-OS environment they use other techniques, such as random instruction and register selection when compiling, and fuzzing, to help check for bugs.
Next I went straight to my final session of the conference, given by Siva Chandra Reddy, on using LLDB for debugging. Siva started by giving a report on the LLDB project status. Encouragingly, the project now has 11 developers and has support for Linux and Android, with Windows support under active development. Remote debugging support has recently been checked in, is documented, and now uses a remote debug server. X86 and X86-64 support is available now, with ARM and ARM64 support under development. On Windows, Win32 support is mostly complete with Win64 coming along. Next some details on using remote debugging were given, first for when both debugger and debuggee are on the same platform and then for the more complex case where they are different, as is common in the embedded world. Finally Siva covered debugging and testing the debugger, mentioning that LLDB has very good logging facilities as well as special command line arguments and environment variables. As for testing, because LLDB is very interactive and platform dependent, they use a Python test framework in which each Python test case has an accompanying C/C++/Objective-C file that is used as the thing to be debugged.
There was only the conference close session left and, while waiting in the lecture theatre for it to start, I caught the end of the previous session on a Fortran front end for LLVM – and my ears pricked up at the name ‘Flang’.
References
[1] LLVM, http://llvm.org/
[2] Clang, http://clang.llvm.org/
[3] Embecosm, http://www.embecosm.com/
[4] Myre Laboratories, http://myrelabs.com/Myre
[5] Open Source Hardware User Group, http://oshug.org/
[6] Parallella, http://www.parallella.org/
[7] CppMem, http://svr-pes20-cppmem.cl.cam.ac.uk/cppmem/help.html
[8] DWARF standard, http://www.dwarfstd.org/
[9] T-EMU 2, http://t-emu.terma.com/
[10] TableGen documentation, http://llvm.org/docs/TableGen/
[11] Templight, http://plc.inf.elte.hu/templight/
[12] Metashell, https://github.com/sabel83/metashell
[13] Mikael Persson’s Templight re-write, https://github.com/mikael-s-persson/templight
[14] Mill Computing Inc., http://millcomputing.com/
[15] Mill Computing documentation, http://millcomputing.com/docs/
[16] LLVM-TURBO, http://www.ace.nl/LLVM-TURBO
[17] SPIR, https://www.khronos.org/spir
[18] DejaGnu project, http://www.gnu.org/software/dejagnu/
[19] Pointer To Implementation (pImpl), http://en.wikibooks.org/wiki/C%2B%2B_Programming/Idioms
[20] LLDB project, http://lldb.llvm.org/
[21] Google Chrome Native Client, https://developer.chrome.com/native-client