• Setting Up A Test-Driven React Project From Scratch - Part 2: webpack niceties

    This post will explore some more niceties that using webpack provides.

    ESLint

    ESLint functionality is provided by way of eslint-loader:

    npm install eslint-loader --save-dev
    

    To ensure that linting only runs on your un-transpiled ES6 source files, specify the loader as a preLoader:

    module: {
      preLoaders: [
        {
          test: /\.jsx?$/,
          loaders: ['eslint-loader'],
          exclude: /node_modules/
        }
      ],
    
      loaders: [
        {
          test: /\.jsx?$/,
          loaders: ['babel-loader'],
          exclude: /node_modules/
        }
      ]
    }
    

    ESLint may appear to work right out of the box, but for it to be genuinely useful, you need to configure it to suit your coding style/needs. eslint-loader takes in a config file whose path is specified in webpack.config.js:

    eslint: {
      configFile: path.resolve(__dirname, ".eslintrc")
    }
    

    In your .eslintrc (which the setting above places in your project root), you may wish to include at least the following sane, barebones config:

    {
      "ecmaFeatures": {
        "arrowFunctions": true,
        "blockBindings": true,
        "classes": true,
        "destructuring": true,
        "forOf": true,
        "modules": true,
        "jsx": true
      },
      "rules": {
        "quotes": [2, "single"],
        "strict": [2, "never"],
        "react/jsx-uses-react": 2,
        "react/jsx-uses-vars": 2,
        "react/react-in-jsx-scope": 2
      }
    }
    

    This barely scratches the surface of how ESLint can be configured. The ESLint docs are excellent.

    ESLint React

    eslint-plugin-react is an ESLint plugin that provides React-specific linting rules. To install:

    npm install eslint-plugin-react --save-dev
    

    Then, modify your .eslintrc file to add the following:

    {
      "plugins": [ "react" ]
    }
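
    Putting the plugin and the earlier rules together, your .eslintrc might end up looking something like this (the react/* rules from before assume the plugin is present):

    {
      "plugins": [ "react" ],
      "ecmaFeatures": {
        "arrowFunctions": true,
        "blockBindings": true,
        "classes": true,
        "destructuring": true,
        "forOf": true,
        "modules": true,
        "jsx": true
      },
      "rules": {
        "quotes": [2, "single"],
        "strict": [2, "never"],
        "react/jsx-uses-react": 2,
        "react/jsx-uses-vars": 2,
        "react/react-in-jsx-scope": 2
      }
    }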
    

    Hot Reloading

    TODO

  • Setting Up A Test-Driven React Project From Scratch - Part 1: webpack

    You will learn how to scaffold a React project from scratch with Karma and webpack.

    I’ve tried to make it as unopinionated as possible. Hopefully, once you understand the rationale behind each of the moving parts, you can roll your own or use one of the many templates available elsewhere.

    At the end of this part, you will have set up webpack with:

    • babel-loader
    • webpack-dev-server
    • HtmlWebpackPlugin
    • Source maps with the -d flag

    Let’s create a simple project that allows users to create reviews of beer and populate a list. We’ll call it Beerist.

    Let’s create a project directory:

    mkdir beerist && cd beerist
    

    Initialize a package.json by filling in the fields as appropriate:

    npm init
    
    {
      "name": "beerist",
      "version": "1.0.0",
      "description": "A beer review site",
      "main": "index.js",
      "dependencies": {},
      "devDependencies": {},
      "scripts": {
        "test": "echo \"Error: no test specified\" && exit 1"
      },
      "author": "Lau Siaw Young",
      "license": "ISC"
    }
    

    Create a src directory that will hold all our source files. Within src, create a js directory and an index.js that will act as an entry point for the app. Our directory now looks like this:

    ├── package.json
    └── src
        └── js
            └── index.js
    

    In index.js, let’s just write a simple one-liner for the purposes of testing:

    console.log("index.js is loaded!");
    

    webpack

    webpack was born as an agnostic module loader. Its Motivation page introduces it quite nicely.

    Since its inception, webpack has grown to encompass some elements of build tooling as well, which greatly reduces the need for Gulp or Grunt.

    We’ll install webpack1 first so that we can write our React files in ES6 (or ES2015, or ES6+, or ES7, or whatever they call it nowadays) from the get-go and have webpack run the files through Babel to transpile our ES6 down to ES5.

    We’ll also install webpack’s dev server so that we can serve the files locally using a built-in Express server and sockets (for hot reloading):

    npm install webpack webpack-dev-server --save-dev
    

    This creates a node_modules directory if it hasn’t previously been created, and downloads the latest version of each specified module from npm. --save-dev saves them as development dependency entries within your package.json2:

    "devDependencies": {
      "node-libs-browser": "^0.5.2",
      "webpack": "^1.10.1",
      "webpack-dev-server": "^1.10.1"
    }
    

    Create a file called webpack.config.js, which tells webpack what to do:

    var webpack = require('webpack')
    var path    = require('path')
    
    
    module.exports = {
      
      entry: [
        path.resolve(__dirname, "./src/js/index.js")
      ],
    
      output: {
        path: path.resolve(__dirname, 'dist'),
        filename: "bundle.js"
      }
    
    }
    

    The config starts off by requiring webpack and path, which is a built-in Node library that provides utility methods for manipulating file path strings. In particular, we use path.resolve together with the “global” variable __dirname to resolve relative file paths to absolute ones.
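
    For instance, assuming the project lives at /Users/you/beerist (a made-up location for illustration), the entry path above resolves like so:

    var path = require('path');
    
    // __dirname is the absolute path of the directory containing this file
    path.resolve(__dirname, "./src/js/index.js");
    // => "/Users/you/beerist/src/js/index.js"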

    Now how do we run webpack and/or its dev server? If you’ve installed webpack and/or webpack-dev-server globally, you can simply run webpack or webpack-dev-server on the command line to fire them up. Otherwise, you can call them from an npm script inside package.json3:

    "scripts": {
      "start": "webpack-dev-server",
      "build": "webpack"
    },
    

    Now, we can run npm start or npm run build (check out the difference between the two for yourself). After running npm run build, our directory should look like this:

    ├── dist
    │   └── bundle.js
    ├── package.json
    ├── src
    │   └── js
    │       └── index.js
    └── webpack.config.js
    

    If you open bundle.js, you’ll see some crazy stuff that webpack is doing to tie all your dependencies together. Of course, you’ll see your source code - all the way at the bottom:

    /***/ function(module, exports) {
    
      console.log("index.js works");
    
    /***/ }
    /******/ ]);
    

    To see it in action in the browser though, we’ll need an HTML entry point that loads our bundle.js.

    Plugins

    Introducing: webpack plugins, which provide additional functionality to the webpack building and bundling process. We’ll start by installing two packages - a plugin and a loader:

    npm install babel-loader html-webpack-plugin --save-dev
    

    html-webpack-plugin is a plugin that we’re using here to generate an HTML file that includes all our webpack bundle(s).

    Loaders transform files from one form to another as webpack bundles them. babel-loader, as mentioned earlier, transforms ES6 code into ES5 code so that current browsers can run our app.

    Our webpack.config.js is changed to accommodate these new plugins.

    Adding a new plugin is as simple as requiring it, and adding a new instance of it in the plugins array.

    Adding a loader is only slightly more complicated - the additional thing we need to specify is test, which tells the loader which files are eligible to go through the loader for transformation. Here, we specify with a regular expression that only files ending with .js or .jsx are eligible to be transformed by babel-loader.

    I’ve demarcated diffed lines with // for single line changes and //- for block changes.

    var webpack = require('webpack')
    var path    = require('path')
    var HtmlWebpackPlugin = require('html-webpack-plugin') //
    
    module.exports = {
    
      entry: [
        path.resolve(__dirname, "./src/js/index.js")
      ],
    
      output: {
        path: path.resolve(__dirname, 'dist'),
        filename: "bundle.js"
      },
    
      module: { //-
        loaders: [
          {
            test: /\.jsx?$/,
            loaders: ['babel-loader']
          }
        ]
      },
    
      plugins: [new HtmlWebpackPlugin()] //-
    
    }
    

    To confirm that babel-loader is doing its job, let’s write some ES6 code in index.js:

    console.log("index.js works");
    
    let x = 5; // wow so ES6
    console.log(x);
    

    Run npm run build again - you should now see an index.html in the dist folder:

    ├── dist
    │   ├── bundle.js
    │   └── index.html
    ├── package.json
    ├── src
    │   └── js
    │       └── index.js
    └── webpack.config.js
    

    Run npm start, navigate to http://localhost:8080 (I’m using Chrome), and open up your console:

    index.js works
    5
    

    To demonstrate the agnostic aspect of webpack’s module loading, let’s write two throwaway files:

    ├── dist
    │   ├── bundle.js
    │   └── index.html
    ├── package.json
    ├── src
    │   └── js
    │       ├── index.js
    │       ├── test1.js
    │       └── test2.js
    └── webpack.config.js
    

    test1.js and test2.js contain just one line each: console.log("test1/2.js is imported").
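
    Spelled out, the two files would look something like this:

    // test1.js
    console.log("test1.js is imported");
    
    // test2.js
    console.log("test2.js is imported");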

    We’ll then import these two files in two different ways in index.js - test1.js using ES6 module syntax, and test2.js using CommonJS/browserify syntax4:

    console.log("index.js works");
    
    let something = 5;
    console.log(something);
    
    import Test1 from './test1.js'
    require('./test2.js');
    

    Refreshing the browser yields:

    test1.js is imported
    index.js works
    5
    test2.js is imported
    

    Immediately, you’ll see that ES6 module imports are hoisted to the top of the file, whereas CommonJS imports are run in-place (this is a simplistic treatment of the issue, but it is generally true).

    Source Maps

    You’ll also notice the line number on the right says something like bundle.js(59), even for console.logs emanating from different source files. This could potentially make your debugging very difficult in the future when you have many source files and all you have is bundle.js(38593) or some such to work backwards with.

    Source maps allow the browser to map bundled JS files back to their unbundled state, so that things like errors’ and console.logs’ line numbers correspond back to their sources.

    The simplest way to enable source maps is by passing the -d flag5. In the package.json:

    "scripts": {
      "start": "webpack-dev-server -d"
    }
    

    The -d flag is actually short for --debug --devtool source-map --output-pathinfo. Each of these options is explained in further detail here.

    In the next part, we’ll look at hot reloading with hot-loader, linting with eslint-loader and eslint-plugin-react, as well as a couple of other useful webpack plugins.

    Footnotes
    1. We’re installing it locally for now, but you can also install it globally by passing the -g flag. This gives you access to the webpack command on the command line.

    2. In case you were wondering, the ^ in "^0.5.2" is a semver range: it allows updates that don’t change the leftmost non-zero part of the version. For versions 1.0.0 and above, that means anything within the same major version (the first number); for a 0.x version like this one, only patch-level updates are allowed, so ^0.5.2 matches all versions from 0.5.2 up to (but not including) 0.6.0.

    3. The reason we’re doing this is that npm adds node_modules/.bin to the PATH when running scripts specified in package.json, thus making node_modules/.bin/webpack and node_modules/.bin/webpack-dev-server available to the scripts (open up the node_modules folder to see for yourself; it’s there).

    4. Regardless of ES6 module syntax or CommonJS require syntax, please put all your imports at the top of the file.

    5. One can also enable source maps for other assets using the built-in source maps plugin:

      var webpack = require('webpack')
      var path    = require('path')
      var HtmlWebpackPlugin = require('html-webpack-plugin')
      
      module.exports = {
      
        entry: [
          path.resolve(__dirname, "./src/js/index.js")
        ],
      
        output: {
          path: path.resolve(__dirname, 'dist'),
          filename: "bundle.js",
          sourceMapFilename: "bundle.js.map"
        },
      
        module: {
          loaders: [
            {
              test: /\.jsx?$/,
              loaders: ['babel-loader'],
              include: path.resolve(__dirname, 'src/js')
            }
          ]
        },
      
        plugins: [
          new HtmlWebpackPlugin(),
          new webpack.SourceMapDevToolPlugin() //
        ]
      
      }
      

      The default behavior of the plugin is to map only JavaScript files, and to inline the source map. These, along with other options, can be changed.

  • Apply, Apply

    While going through Jasmine’s source code, I came across a weird idiom:

    function attemptAsync(queueableFn) {
      var clearTimeout = function () {
          Function.prototype.apply.apply(self.timer.clearTimeout, [j$.getGlobal(), [timeoutId]]);
        },
        next = once(function () {
          clearTimeout(timeoutId);
          self.run(queueableFns, iterativeIndex + 1);
        }),
        timeoutId;
      ...
    }
    

    What in the world is Function.prototype.apply.apply? I decided to find out for myself.

    As a refresher, let’s write a function coolFx that simply prints out its this and arguments. When invoked as a method on a function, apply executes that function, using apply’s first parameter as the function’s this and its second parameter (an array) as the function’s argument list. The this value passed to apply can be anything, even a primitive, as demonstrated below:

    function coolFx() {
      console.log("My cool function is running!!");
      console.log("coolFx's this:", this);
      console.log("coolFx's arguments:", arguments);
    }
    
    var anything = 1;
    
    coolFx.apply(anything, [anything, anything, anything]);
    

    prints:

    My cool function is running!!
    coolFx's this: Number {[[PrimitiveValue]]: 1}
    coolFx's arguments: [1,1,1]
    

    To preserve our sanity later on, let’s see apply in more detail by defining a wrapper around apply:

    Function.prototype.firstApply = function(x,y) {
      console.log("This is firstApply's this: ", this);
      console.log("This is firstApply's first parameter:", x);
      console.log("This is firstApply's second parameter:", y);
      this.apply(x,y)
    };
    Function.prototype.secondApply = function(x,y) {
      console.log("This is secondApply's this: ", this);
      console.log("This is secondApply's first parameter:", x);
      console.log("This is secondApply's second parameter:", y);
      this.apply(x,y)
    };
    

    Running coolFx.firstApply(anything, [anything]) with this modified version of apply prints:

    This is firstApply's this:  coolFx()
    This is firstApply's first parameter: 1
    This is firstApply's second parameter: [1]
    
    My cool function is running!!
    coolFx's this: Number {[[PrimitiveValue]]: 1}
    coolFx's arguments: [1]
    

    So what does it mean when you call apply again on itself, as in coolFx.firstApply.secondApply?

    coolFx.firstApply.secondApply(anything, [anything, anything])
    
    This is the secondApply's this: Function.firstApply(x, y)
    This is the secondApply's first parameter: 1
    This is the secondApply's second parameter: [1, 1]
    
    This is the firstApply's this: Number {[[PrimitiveValue]]: 1}
    This is the firstApply's first parameter: 1
    This is the firstApply's second parameter: 1
    
    Uncaught TypeError: this.apply is not a function
    

    Calling secondApply on firstApply means that secondApply will execute firstApply, and since firstApply’s body in turn calls this.apply, its this - secondApply’s first parameter - must be a function. (I only know this from working backwards from the interpreter’s errors, go figure).

    In this case, since this in firstApply is the primitive 1 (boxed into a Number object), this.apply is not a function - Number objects don’t inherit from Function.prototype.

    Now that we know that secondApply expects a function as its first parameter, let’s try putting coolFx there:

    coolFx.firstApply.secondApply(coolFx, [anything, anything])
    
    This is the secondApply's this: Function.firstApply(x, y)
    This is the secondApply's first parameter: coolFx()
    This is the secondApply's second parameter: [1, 1]
    
    This is the firstApply's this: coolFx()
    This is the firstApply's first parameter: 1
    This is the firstApply's second parameter: 1
    
    Uncaught TypeError: Function.prototype.apply: Arguments list has wrong type
    

    Okay, so at least it’s a different error now. Why is firstApply complaining that its argument list (the second parameter) has the wrong type? That’s because it’s trying to call coolFx.apply(1,1)! Don’t forget that apply expects an array as its second parameter.

    So I guess we can wrap the second anything in an array:

    coolFx.firstApply.secondApply(coolFx, [anything, [anything]])
    
    This is the secondApply's this: Function.firstApply(x, y)
    This is the secondApply's first parameter: coolFx()
    This is the secondApply's second parameter: [1, Array[1]]
    
    This is the firstApply's this: coolFx()
    This is the firstApply's first parameter: 1
    This is the firstApply's second parameter: [1]
    
    My cool function is running!!
    coolFx's this: Number {[[PrimitiveValue]]: 1}
    coolFx's arguments: [1]
    

    So it finally works! Wait, doesn’t this look… familiar? It should, because what it’s doing is exactly what coolFx.apply(anything, [anything]) is doing (see above) - executing coolFx with this as anything and its arguments as [anything].
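
    That is, the whole chain boils down to the plain call from earlier:

    coolFx.apply(anything, [anything]);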

    And isn’t it stupid that we’re mentioning coolFx twice? What exactly does the first coolFx even do? Let’s get rid of it:

    Function.prototype.firstApply.secondApply(coolFx, [anything, [anything]])
    
    This is the secondApply's this: Function.firstApply(x, y)
    This is the secondApply's first parameter: coolFx()
    This is the secondApply's second parameter: [1, Array[1]]
    
    This is the firstApply's this: coolFx()
    This is the firstApply's first parameter: 1
    This is the firstApply's second parameter: [1]
    
    My cool function is running!!
    coolFx's this: Number {[[PrimitiveValue]]: 1}
    coolFx's arguments: [1]
    

    It turns out that it doesn’t matter which function firstApply is called on, because firstApply’s original context is irrelevant (imagine it behaving as a static method) - the function that eventually runs is the one passed in as the first parameter. firstApply is just there to facilitate this process by helping “promote” that first parameter (the function) into the executing function itself.

    Maybe it’s a little clearer if you remove secondApply from the equation and call firstApply directly:

    Function.prototype.firstApply(coolFx, [anything, [anything]])
    
    This is the firstApply's this: Empty()
    This is the firstApply's first parameter: coolFx()
    This is the firstApply's second parameter: [1, Array[1]]
    

    So there you have it. And after all this trouble, I still don’t actually know why Jasmine uses this idiom. Instead of

    Function.prototype.apply.apply(self.timer.clearTimeout, [j$.getGlobal(), [timeoutId]]);
    

    It seems the following is equivalent:

    self.timer.clearTimeout.apply(j$.getGlobal(), [timeoutId])
    

    Bonus

    What happens if we chain more than two applys together? (thirdApply below is defined just like the other two wrappers.)

    Function.prototype.firstApply.secondApply.thirdApply(coolFx, [anything, [anything]])
    
    This is the thirdApply's this: Function.secondApply(x, y)
    This is the thirdApply's first parameter: coolFx()
    This is the thirdApply's second parameter: [1, Array[1]]
    
    This is the secondApply's this: coolFx()
    This is the secondApply's first parameter: 1
    This is the secondApply's second parameter: [1]
    
    My cool function is running!!
    coolFx's this: Number {[[PrimitiveValue]]: 1}
    coolFx's arguments: [1]
    

    Looks like firstApply doesn’t even get run at all. This is puzzling at first, but makes sense when you think about it - secondApply’s original this, firstApply, has been “diverted” to coolFx (remember when I said above that firstApply’s original context is irrelevant?). Thus, when secondApply does its job, it’s coolFx that it executes. After it’s done, firstApply is all but forgotten. Of course, this remains the same regardless of how many applys you chain.

    More Bonus

    So you’ve decided against your better judgment to use this idiom in your code, but you really don’t like having to nest the eventual arguments in another array. There’s actually a way to work around that - use call1 instead! (randomFunction below stands in for any function; all we need is its call method, which comes from Function.prototype.)

    randomFunction.call.secondApply(coolFx,[anything, anything, anything, anything, anything]);
    
    This is secondApply's this:  call()
    This is secondApply's first parameter: coolFx()
    This is secondApply's second parameter: [1, 1, 1, 1, 1]
    
    My cool function is running!!
    coolFx's this: Number {[[PrimitiveValue]]: 1}
    coolFx's arguments: [1, 1, 1, 1]
    

    (Okay, this post turned out to be way longer than I had anticipated.)

    Footnotes
    1. In case you’re too lazy to follow through on the link, I’ll paste it here for your convenience, courtesy of MDN:

      While the syntax of this function is almost identical to that of apply(), the fundamental difference is that call() accepts an argument list, while apply() accepts a single array of arguments.
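
      A trivial illustration (not from the Jasmine source):

      function greet(greeting, name) {
        console.log(greeting + ", " + name);
      }
      
      greet.call(null, "Hello", "world");    // arguments passed individually
      greet.apply(null, ["Hello", "world"]); // arguments passed as one array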

  • Rails Boot Sequence (Part 1)

    Today, we investigate Rails’ boot sequence by observing what happens when we run rails console. Part 2 will look at rails server. GitHub links to relevant files are provided as necessary.

    Our journey begins inside the rails binary1, which is executed by ruby_executable_hooks2:

    #!/usr/bin/env ruby_executable_hooks
    # This file was generated by RubyGems.
    # The application 'railties' is installed as part of a gem, and
    # this file is here to facilitate running it.
    
    require 'rubygems'
    version = ">= 0"
    
    if ARGV.first
      str = ARGV.first
      str = str.dup.force_encoding("BINARY") if str.respond_to? :force_encoding
      if str =~ /\A_(.*)_\z/ and Gem::Version.correct?($1) then
        version = $1
        ARGV.shift
      end
    end
    
    gem 'railties', version
    load Gem.bin_path('railties', 'rails', version)
    

    It calls load Gem.bin_path('railties', 'rails', version), which corresponds to gems/railties-4.X.X/exe/rails:

    #!/usr/bin/env ruby
    
    git_path = File.expand_path('../../../.git', __FILE__)
    
    if File.exist?(git_path)
      railties_path = File.expand_path('../../lib', __FILE__)
      $:.unshift(railties_path)
    end
    require "rails/cli"
    

    In gems/railties-4.X.X/lib/rails/cli.rb:

    require 'rails/app_loader'
    
    # If we are inside a Rails application this method performs an exec and thus
    # the rest of this script is not run.
    Rails::AppLoader.exec_app
    ...
    

    exec_app is in charge of executing the bin/rails inside your Rails application. It searches for it by walking up the directory tree, which means you can call rails from anywhere inside your application directory. In fact, rails server or rails console is equivalent to calling ruby bin/rails server or ruby bin/rails console. See the abridged contents of rails/app_loader.rb below:

    module Rails
      module AppLoader # :nodoc:
        extend self
    
        RUBY = Gem.ruby
        EXECUTABLES = ['bin/rails', 'script/rails']
    
        def exec_app
          original_cwd = Dir.pwd
    
          loop do
            
            # ... code to check for the executable and execute it if found ...
    
            # If we exhaust the search there is no executable, this could be a
            # call to generate a new application, so restore the original cwd.
            Dir.chdir(original_cwd) and return if Pathname.new(Dir.pwd).root?
    
            # Otherwise keep moving upwards in search of an executable.
            Dir.chdir('..')
          end
        end
    
        def find_executable
          EXECUTABLES.find { |exe| File.file?(exe) }
        end
      end
    end
    

    Next, we turn our focus temporarily to your Rails application. In bin/rails, two files are required:

    #!/usr/bin/env ruby
    
    ### The below part will be present if you use spring
    # begin
    #  load File.expand_path("../spring", __FILE__)
    # rescue LoadError
    # end
    APP_PATH = File.expand_path('../../config/application', __FILE__)
    require_relative '../config/boot'
    require 'rails/commands'
    

    ../config/boot (in your app directory) determines the location of the Gemfile and allows Bundler to configure the load path for your Gemfile’s dependencies.

    ENV['BUNDLE_GEMFILE'] ||= File.expand_path('../../Gemfile', __FILE__)
    
    require 'bundler/setup' # Set up gems listed in the Gemfile.
    

    rails/commands parses options passed in as command line arguments, including alias mapping (c for console, g for generate, etc.):

    ARGV << '--help' if ARGV.empty?
    
    aliases = {
      "g"  => "generate",
      "d"  => "destroy",
      "c"  => "console",
      "s"  => "server",
      "db" => "dbconsole",
      "r"  => "runner"
    }
    
    command = ARGV.shift
    command = aliases[command] || command
    
    require 'rails/commands/commands_tasks'
    
    Rails::CommandsTasks.new(ARGV).run_command!(command)
    

    rails/commands/commands_tasks.rb is in charge of throwing errors in the case of invalid commands, or delegating valid commands to the respective methods, themselves split into files in the rails/commands directory:

    $ ls
    application.rb    console.rb        destroy.rb        plugin.rb         server.rb
    commands_tasks.rb dbconsole.rb      generate.rb       runner.rb
    

    For example, if rails console is run, the console method in commands_tasks.rb requires console.rb and runs the start class method on the Rails::Console class, passing it your application as the first argument (commands_tasks.rb knows about your application because it requires APP_PATH, which you kindly provided earlier in bin/rails):

    def console
      require_command!("console")
      options = Rails::Console.parse_arguments(argv)
    
      # RAILS_ENV needs to be set before config/application is required
      ENV['RAILS_ENV'] = options[:environment] if options[:environment]
    
      # shift ARGV so IRB doesn't freak
      shift_argv!
    
      require_application_and_environment!
      Rails::Console.start(Rails.application, options)
    end
    
    # some ways down
    private
      def require_command!(command)
        require "rails/commands/#{command}"
      end
      
      def require_application_and_environment!
        require APP_PATH
        Rails.application.require_environment!
      end
    

    In rails/commands/console.rb, you can see the start class method instantiating itself and calling the new instance’s start instance method:

    class << self # old idiom for defining class methods, equivalent to def self.start
      def start(*args)
        new(*args).start
      end
    

    As it is instantiated, @app is set as your Rails application, and @console is set to app.config.console if present, or defaults to IRB:

    def initialize(app, options={})
      @app     = app
      @options = options
    
      app.sandbox = sandbox?
      app.load_console
    
      @console = app.config.console || IRB
    end
    

    Let’s see if the above code actually works by setting your application config to use Pry as the console instead:

    # don't forget to add gem 'pry' to your Gemfile and bundle
    # in coolrailsapp/config/application.rb
    module CoolRailsApp
      class Application < Rails::Application
        ...
        config.console = Pry
      end
    end
    
    $ rails c
    Loading development environment (Rails 4.2.3)
    [1] pry(main)>
    

    Great success! Now let’s look at the actual start instance method, whose code is relatively self-explanatory:

    def start
      if RUBY_VERSION < '2.0.0'
        require_debugger if debugger?
      end
    
      set_environment! if environment?
    
      if sandbox?
        puts "Loading #{Rails.env} environment in sandbox (Rails #{Rails.version})"
        puts "Any modifications you make will be rolled back on exit"
      else
        puts "Loading #{Rails.env} environment (Rails #{Rails.version})"
      end
    
      if defined?(console::ExtendCommandBundle)
        console::ExtendCommandBundle.send :include, Rails::ConsoleMethods
      end
      console.start
    end
    

    Finally, console.start boots the console3.

    Next, we’ll look at the code path taken by rails server.

    Footnotes
    1. As indicated in the comments, this file is auto-generated by RubyGems. How does it know to load Rails, as in the last line (load Gem.bin_path('railties', 'rails', version))? Taking a look in railties.gemspec gives us the answer:

      s.bindir      = 'exe'
      s.executables = ['rails']
      

      What does the above mean? RubyGems’ documentation:

      EXECUTABLES

      Executables included in the gem.

      For example, the rake gem has rake as an executable. You don’t specify the full path (as in bin/rake); all application-style files are expected to be found in bindir.

      Take a look inside the exe directory - its contents will be very familiar soon :)

    2. The binary is directed by its shebang to be executed by ruby_executable_hooks, which is a thin wrapper that allows RubyGems to run initialization hooks (Gem::ExecutableHooks.run($0)) before Ruby runs the actual code (eval File.read($0), binding, $0). This is what the actual ruby_executable_hooks binary looks like:

      #!/usr/bin/env ruby
      
      title = "ruby #{ARGV*" "}"
      $0    = ARGV.shift
      Process.setproctitle(title) if Process.methods.include?(:setproctitle)
      
      require 'rubygems'
      begin
        require 'executable-hooks/hooks'
        Gem::ExecutableHooks.run($0)
      rescue LoadError
        warn "unable to load executable-hooks/hooks" if ENV.key?('ExecutableHooks_DEBUG')
      end
      
      eval File.read($0), binding, $0
      

    3. You can test this out for yourself with just 3 lines of code. Create a file with the following:

      require 'irb'
      x = IRB
      x.start
      

      Run it and see what happens:

      $ ruby ~/test.rb
      >>
      

  • Notes on "Rebuilding a Web Server"

    Some notes I took while watching Rebuilding a Web Server, a brief walkthrough by Marc-André Cournoyer on writing a simple Rack-compliant web server. The code for the class is here.

    Concurrency

    The entire stack looks like this:

    Browser -> Socket -> HTTP Parser -> Rack -> Your App
    

    There’s also a scheduler running alongside, handling concurrent connections. Such a scheduler can be implemented in different ways: threads, pre-forked processes, or an event loop.

    Threads

    A naive implementation would look like this, spawning a new thread for each incoming socket connection:

    # inside the server's class definition
    ...
      def start
        loop do
          socket = @server.accept
          Thread.new do
            connection = Connection.new(socket, @app)
            connection.process
          end
        end
      end
    ...
    

    Web servers like Puma use threads. Thread spawning is quite expensive, so web servers that use threads for concurrency will usually spawn a number of threads (thread pool) on bootup and reuse them.
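
    A minimal sketch of that idea (not Puma’s actual implementation): spawn the workers once, up front, and hand them sockets through a thread-safe queue.

    # inside the server's class definition
    ...
      def start(pool_size = 5)
        jobs = Queue.new
    
        # spawn a fixed pool of worker threads once, on bootup
        pool_size.times do
          Thread.new do
            loop do
              socket = jobs.pop                     # blocks until a socket is queued
              Connection.new(socket, @app).process
            end
          end
        end
    
        # the accept loop just hands accepted sockets off to the pool
        loop do
          jobs << @server.accept
        end
      end
    ...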

    Pre-forked Processes

    Preforking is a popular concurrency model used by servers such as Unicorn and Nginx. fork creates a copy of the current process, and this child process is attached to its parent process. The two of them share the same socket1.

    # inside the server's class definition
    ...
      def initialize(port, app)
        @server = TCPServer.new(port)
        @app = app
      end
    
      def prefork(workers)
        workers.times do
          fork do
            start
          end
        end
        Process.waitall
      end
    
      def start
        loop do
          socket = @server.accept
          connection = Connection.new(socket, @app)
          connection.process # goes on to process the raw socket data
        end
      end
    ...
    
    server.prefork(5) # for 5 child worker processes
    

    Worker processes are forked beforehand, and all of them share the same listening socket. Whichever process is free will be scheduled by the OS scheduler to handle the next incoming connection on the socket. Presumably, leveraging the OS scheduler like this is really efficient.

    Event Loop

    We can simulate an event loop in Ruby using a gem called eventmachine. eventmachine is a feature-packed gem, and comes with helper methods that handle accepting, reading and writing to and from socket connections for us.

    # inside the server's class definition
    ...
      def start_event_machine
        EM.run do
          EM.start_server "localhost", 3000, EMConnection do |conn|
            conn.app = @app
          end
        end
      end
    
      class EMConnection < EM::Connection
        attr_accessor :app
        def post_init
          @parser = Http::Parser.new(self)
        end
        def receive_data(data)
          @parser << data
        end
        ...
      end
    ...
    
    server.start_event_machine
    

    readpartial

    readpartial is an instance method of the IO class in Ruby which allows us to read data off a socket as soon as data is available. The APIDock entry on readpartial elaborates further:

    readpartial is designed for streams such as pipe, socket, tty, etc. It blocks only when no data immediately available. This means that it blocks only when following all conditions hold.

    • the byte buffer in the IO object is empty.

    • the content of the stream is empty.

    • the stream is not reached to EOF.

    Using the readpartial method, we can read off a socket like this:

    data = socket.readpartial(1024) # reads at most 1024 bytes from the I/O stream
    puts data
    
    # do other things with data
    

    sysread is a method with similar functionality.

    http_parser.rb

    http_parser.rb is a gem that wraps around Node’s HTTP parser.
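
    Basic usage looks roughly like this (a sketch from memory of the gem’s README - double-check the exact API against the gem itself):

    require 'http/parser'
    
    parser = Http::Parser.new
    
    parser.on_headers_complete = proc do
      p parser.http_method   # e.g. "GET"
      p parser.request_url   # e.g. "/"
      p parser.headers       # e.g. {"Host" => "localhost"}
    end
    
    # feed the parser raw data as it is read off the socket
    parser << "GET / HTTP/1.1\r\nHost: localhost\r\n\r\n"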

    Rack

    Rack is a set of specifications that web servers, middleware applications, and application frameworks must adhere to. Rack apps must have a single point of entry named call, which must return an array containing the status code, the headers, and the body of the response.
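
    For example, the smallest possible Rack app is just an object that responds to call (a sketch, not from the walkthrough):

    # config.ru - run with `rackup`
    class HelloApp
      def call(env)
        [200, { 'Content-Type' => 'text/plain' }, ['Hello from Rack']]
      end
    end
    
    run HelloApp.new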

    Things which behave exactly like Rack tells them to (e.g. Unicorn, Rails) are Rack-compliant, and the benefit of this is that Rack-compliant things can be used in conjunction, layered on top of each other, or swapped out and replaced, without each having knowledge of the other (yep, abstraction).

    Noah Gibbs’ nice book Rebuilding Rails offers an excellent practical tutorial on Rack. The book covers more than just Rack, but the chapters on Rack are particularly illuminating.

    KIV: Notes on Rebuilding Rails

    Footnotes
    1. More explicitly, the reason why they share the same socket is because of the file descriptor inheritance that happens in fork. According to Linux’s man pages:

      The child inherits copies of the parent’s set of open file descriptors. Each file descriptor in the child refers to the same open file description (see open(2)) as the corresponding file descriptor in the parent. This means that the two descriptors share open file status flags, current file offset, and signal-driven I/O attributes.