Sunday, 16 December 2012

Reading standard input from GMainLoop

The challenge

This time I stumbled upon a challenge: integrating the handling of standard input data (a very primitive form of user interaction) with the GMainLoop. I knew there must be a solution to this problem in a project like GLib. But as a busy developer I hardly ever devour entire manuals upfront with all the details. I tend to just skim through the main sections to get an idea, as I prefer a lazy approach: don't touch it if you don't need it. Of course all the small pieces of knowledge amass, and at some point it's worth reading the manual in depth to line up with what's been learned and fill the gaps (quite likely huge ones).

My real reason for handling user input in the GMainLoop was interacting with a GStreamer pipeline. This hopefully makes the story more interesting (introduces some drama). Of course there's the great gst-launch-1.0 tool, which in many cases is good enough as it lets you build and run quite complex pipelines. When it comes to interacting with a pipeline (sending/receiving events/messages, reconfiguring the pipeline, setting element/object properties dynamically), this tool doesn't suffice. One has to write a GStreamer application.

The other remark is that as a busy developer I prefer not to reinvent the wheel just to learn something (although sometimes it's reasonable) if there's already a solution to a problem. Quite often people complicate their lives or even shoot themselves in the foot by trying the DIY approach. In the case of GStreamer they jump straight into writing a GStreamer application. It's not even that horrible when they choose Python, but that's not hard enough: they use C! And even in C people are incredibly creative in making their lives miserable. Instead of using gst_parse_launch() they perform a copy-pasta concerto, creating the world bit by bit. Very often they would simply get away with gst-launch-1.0 if they just briefly read some documentation.

So why do I need to write a C application?

Why an application at all? I can't use gst-launch-1.0 as I need to interact with the pipeline. Why not Python? I need to run it on an embedded platform and at the moment can't afford GObject introspection (additional packages to be installed). So this is what my story looks like, Mr. Jeremy Kyle.


Very often code examples come with a remark that the error handling is omitted for brevity. Fair enough if one dwells on a code snippet and tries to explain some mysterious API intricacy or some coding phenomenon. I try to follow a complementary approach: keep code snippets brief but also provide a simple but complete program that is as accurate as possible when it comes to error handling (to the best of my knowledge) and does something useful (at least as a proof of concept).

Let's do it!

The source code for gmainloop-io-example.c is here. Below I'll show just the relevant excerpts.

Not being a genius, and being a programmer who avoids unnecessary work, I simply followed the widely advertised skeleton of a GStreamer application, with main() initializing the main loop and a GStreamer bus callback. See the introductory manual for more information. For parsing a pipeline description on the command line I simply stole some code from gst-launch.c:

/* make a null-terminated version of argv */
argvn = g_new0 (char *, argc);
memcpy (argvn, argv + 1, sizeof (char *) * (argc - 1));
data.pipeline =
    (GstElement *) gst_parse_launchv ((const gchar **) argvn, &error);
g_free (argvn);
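To make the argv trick above easier to follow, here is a minimal sketch of the same idea in plain C, without GLib. The helper name copy_args_null_terminated() is hypothetical and calloc() stands in for g_new0(); the point is only that allocating argc slots leaves room for the argc - 1 arguments plus a terminating NULL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of GLib or GStreamer): returns a
 * NULL-terminated copy of argv[1..argc-1]. The caller frees the array
 * itself, not the strings it points to. Returns NULL if argc < 1 or
 * if the allocation fails. */
char **
copy_args_null_terminated (int argc, char **argv)
{
  char **argvn;

  assert (argv != NULL);
  if (argc < 1)
    return NULL;

  /* argc slots: argc - 1 arguments plus the terminator. calloc() zeroes
   * the memory, which reads back as NULL pointers on common platforms. */
  argvn = calloc ((size_t) argc, sizeof (char *));
  if (argvn == NULL)
    return NULL;

  /* Skip argv[0] (the program name) and copy the remaining pointers. */
  memcpy (argvn, argv + 1, sizeof (char *) * (size_t) (argc - 1));
  return argvn;
}
```

Called with the arguments of ./gmainloop-io-example fakesrc ! fakesink, the returned array would hold the three pipeline tokens followed by NULL.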
It's worth noting that quite often I see pipeline parsing/initialization failures not being handled properly. The gst_parse_launch() documentation says:
Please note that you might get a return value that is not NULL even though the error is set.
Luckily enough my approach to handle erroneous conditions converged with the one in gst-launch.c:
/* handling pipeline creation failure */
if (!data.pipeline) {
  g_printerr ("ERROR: pipeline could not be constructed: %s\n",
    error ? GST_STR_NULL (error->message) : "(unknown error)");
  goto untergang;
} else if (error) {
  g_printerr ("Erroneous pipeline: %s\n", GST_STR_NULL (error->message));
  goto untergang;
}
The other thing I've noticed is that quite often GLib source ID variables are initialized with -1 even though they are unsigned. It feels more natural to initialize them with 0, as the g_source_attach() documentation promises valid IDs to be greater than 0.
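A small sketch of why 0 is the friendlier sentinel. The type source_id_t and the helper names below are hypothetical stand-ins (GLib's own type is guint); the only assumption is that valid IDs are greater than 0, as stated above.

```c
#include <limits.h>

/* Hypothetical stand-in for a GLib source ID (a guint, i.e. unsigned int). */
typedef unsigned int source_id_t;

/* 0 is never a valid ID, so it can mean "nothing attached". */
#define SOURCE_ID_NONE 0u

/* Returns non-zero if the ID looks like an attached source. */
int
source_id_is_set (source_id_t id)
{
  return id != SOURCE_ID_NONE;
}

/* What "initialize with -1" really stores: the conversion to unsigned
 * wraps around, so the variable ends up holding UINT_MAX. */
source_id_t
source_id_from_minus_one (void)
{
  return (source_id_t) -1;
}
```

With -1 as the initial value, a check such as source_id_is_set() (or a plain id > 0) would report an attached source where there is none, so every check would have to compare against UINT_MAX instead.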

Attaching IO source

My first naive approach was googling for phrases including "IO", "GMainLoop", "handle" and the like. I ended up reading about GSource and actually wrote some source code for my own new GSource class. But that didn't feel right. This was a good time to have another look at the GLib documentation. And Eureka! I stumbled across the IO Channels section where I found that g_io_add_watch() does all I need. Adding a file descriptor to watch in the GMainLoop is as simple as this:

GIOChannel *io = NULL;
guint io_watch_id = 0;

/* standard input callback */
io = g_io_channel_unix_new (STDIN_FILENO);
io_watch_id = g_io_add_watch (io, G_IO_IN, io_callback, &data);
g_io_channel_unref (io);

The io_callback() is called whenever there's something to read from the standard input, as I specified G_IO_IN as the condition. The callback function reads one character from the IO channel by calling g_io_channel_read_chars(). If it's a 'q', it quits the main loop and de-registers itself by returning FALSE. Otherwise it passes all but the newline characters to a pipeline_stuff() function that is supposed to interpret them and interact with the pipeline. In a more advanced application the user input would be better structured, e.g. with some syntax, keywords etc. For this simple program single characters are used and printed on the console.

static gboolean
io_callback (GIOChannel * io, GIOCondition condition, gpointer data)
{
  gchar in;

  AppData *app_data = (AppData *) data;
  GError *error = NULL;

  switch (g_io_channel_read_chars (io, &in, 1, NULL, &error)) {

    case G_IO_STATUS_NORMAL:
      if ('q' == in) {
        g_main_loop_quit (app_data->loop);
        return FALSE;
      } else if ('\n' != in && !pipeline_stuff (app_data->pipeline, in)) {
        g_warning ("Pipeline stuff failed");
      }

      return TRUE;

    case G_IO_STATUS_ERROR:
      g_printerr ("IO error: %s\n", error->message);
      g_error_free (error);

      return FALSE;

    case G_IO_STATUS_EOF:
      g_warning ("No input data available");
      return TRUE;

    case G_IO_STATUS_AGAIN:
      return TRUE;

    default:
      g_return_val_if_reached (FALSE);
      break;
  }

  return FALSE;
}
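The decision the callback makes for each character can be isolated into a tiny pure function, which makes it easy to test without a main loop. This is only a sketch in plain C; the InputAction enum and classify_input() are hypothetical names, not part of the example program.

```c
/* Hypothetical classification of a single input character. */
typedef enum {
  INPUT_QUIT,     /* 'q': stop the main loop */
  INPUT_IGNORE,   /* newline: nothing to do */
  INPUT_COMMAND   /* anything else: hand it over to the pipeline code */
} InputAction;

/* Mirrors the rules described for the callback: 'q' quits, newlines
 * are skipped, every other character is treated as a command. */
InputAction
classify_input (char in)
{
  if (in == 'q')
    return INPUT_QUIT;
  if (in == '\n')
    return INPUT_IGNORE;
  return INPUT_COMMAND;
}
```

A callback built this way would switch on the returned action, and a more structured input syntax could later replace classify_input() without touching the IO handling.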

Build and run

Here's the way I build and run it on my Linux PC:

[kris@lenovo-x1 kriscience]$ gcc \
  -o gmainloop-io-example{,.c} \
  $(pkg-config --cflags --libs gstreamer-1.0)
[kris@lenovo-x1 kriscience]$ ./gmainloop-io-example fakesrc ! fakesink
** Message: Running...
eat this
** Message: Pipeline stuff. Received command: e
** Message: Pipeline stuff. Received command: a
** Message: Pipeline stuff. Received command: t
** Message: Pipeline stuff. Received command:  
** Message: Pipeline stuff. Received command: t
** Message: Pipeline stuff. Received command: h
** Message: Pipeline stuff. Received command: i
** Message: Pipeline stuff. Received command: s
** Message: Returned, stopping playback
Note that at the time of this writing I'm using Fedora 17 and I have GStreamer-1.0 built from the sources and installed in a custom location. My PKG_CONFIG_PATH is set accordingly to reflect that.

Sunday, 25 November 2012

Hook up a debugger to a stubborn application

Usually debugging an application is straightforward: fire it up with a debugger. Even if the application requires some context (environment variables, a specific directory, arguments etc.), it's not that difficult as either it's scriptable or the IDE will offer a plethora of options.

Now what to do when it comes to debugging an application with a tortuous launch procedure, e.g. through a number of scripts and/or some helper processes? Certainly one may eventually find a way to force the IDE/debugger to handle it, but as a lazy programmer you don't want to go through all of this just for the sake of dirty debugging, do you?

A dirty solution for dirty debugging is to start the application with whatever it requires to start, make it wait for us and our dirty debugger to attach, and then continue under the debugger's control. This approach assumes that we can edit our application's source code and rebuild it more easily than we can find a way to start it with the debugger. This includes the situation when we just want to debug code in a tiny library that is loaded by some unpleasant application.

Good. Now let's get down to the dirty details.

for ( bool bDbg = true; bDbg; ) {}
I know, it looks utterly silly but it expresses the concept. It will loop until someone tells it to stop. The point is that someone is a "third person", like in a crime story. We'll see later who it is and how they tell the loop to stop.

An improved version of the silly loop is a daft loop:

for ( bool bDbg = getenv("KRIS_DBG"); bDbg; ) {}
This one at least exposes its folly only when the environment variable KRIS_DBG is visible to the application process. Put the loop somewhere in your application/library code where you want to start your debugging session. This works even for the optimized build shown below, where the compiler leaves the busy loop in place. Good for us.


#include <cstdlib>
#include <iostream>

int main() {
    for ( bool bDbg = std::getenv("KRIS_DBG"); bDbg; ) {}

    std::cout << "There you go" << std::endl;
    return EXIT_SUCCESS;
}
To make this example more interesting, let's compile it with optimizations so that referring directly to the bDbg variable is impossible (in this mode the compiler doesn't "allocate" it as an entity in the generated code). This might be reasonable when we investigate a crash of an optimized program that doesn't occur in a build with debug information (due to the different process memory layout). So here we go with the command line (Linux, x86 PC):
$ g++ -O3 -o test test.cpp
$ KRIS_DBG=1 ./test &
[1] 1258

$ gdb -p 1258
... some GDB introductory stuff ...
OK, I know it's lame but I'm an Intel syntax (l)user. That's what I was taught in nursery school.
(gdb) set disassembly-flavor intel
Now let's see how our silly loop looks in the disassembly.
(gdb) x/2i $pc
0x8048815 <main+245>: lea    esi,[esi+0x0]
0x8048818 <main+248>: jmp    0x8048815 <main+245>
Rather boring: two instructions executed in a busy loop. Let's have a look at the first 20 instructions of the main() function:

(gdb) x/20i main
0x8048720 <main>:    lea    ecx,[esp+0x4]
0x8048724 <main+4>:  and    esp,0xfffffff0
0x8048727 <main+7>:  push   DWORD PTR [ecx-0x4]
0x804872a <main+10>: push   ebp
0x804872b <main+11>: mov    ebp,esp
0x804872d <main+13>: push   edi
0x804872e <main+14>: push   esi
0x804872f <main+15>: push   ebx
0x8048730 <main+16>: push   ecx
0x8048731 <main+17>: sub    esp,0x118
0x8048737 <main+23>: mov    DWORD PTR [esp],0x80488e4
0x804873e <main+30>: call   0x80485a8 <getenv@plt>
0x8048743 <main+35>: test   eax,eax
0x8048745 <main+37>: jne    0x8048815 <main+245>
0x804874b <main+43>: mov    DWORD PTR [esp+0x8],0x4
0x8048753 <main+51>: mov    DWORD PTR [esp+0x4],0x80488ec
0x804875b <main+59>: mov    DWORD PTR [esp],0x8049b20
0x8048762 <main+66>: call   0x8048618 <_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_i@plt>
0x8048767 <main+71>: mov    eax,ds:0x8049b20
0x804876c <main+76>: mov    eax,DWORD PTR [eax-0xc]
Now all we want to do is force the program to continue execution from the instruction that follows the jump instruction at <main+37>, the one that led us into the busy loop. Let's continue here: <main+43>. Shall we?
(gdb) set $pc=0x804874b
Now we should be able to set breakpoints, watchpoints etc. and start debugging. In this simple example I just let the program continue until it exits.

(gdb) cont
There you go

Program exited normally.
If the program is compiled with debug information so that referring to the bDbg variable is possible, all the chaff above can be reduced to
(gdb) set bDbg=false
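One caveat: both the C and C++ standards allow a compiler to assume that a side-effect-free loop with a non-constant condition terminates, so a different compiler or version could drop the empty loop altogether. Below is a sketch of a more defensive variant in plain C, assuming a volatile flag is acceptable in your code. The names kris_dbg_wait and wait_for_debugger() are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical flag: volatile forces a fresh read on every iteration,
 * so the loop cannot be optimized away and a debugger can clear the
 * flag from outside. */
volatile int kris_dbg_wait = 0;

/* Spins while the flag is set. Returns 0 immediately if the environment
 * variable named by env_name is not set, 1 after a debugger released us. */
int
wait_for_debugger (const char *env_name)
{
  if (getenv (env_name) == NULL)
    return 0;

  kris_dbg_wait = 1;
  while (kris_dbg_wait) {
    /* Busy wait until someone clears the flag. */
  }
  return 1;
}
```

In a build with debug information, attaching with gdb -p and issuing set var kris_dbg_wait = 0 should release the loop; calling wait_for_debugger("KRIS_DBG") at the point of interest plays the same role as the loops above.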

Happy debugging!