Jekyll2024-01-27T20:42:28+00:00/feed.xmlThom’s WeblogNotes about work and other projects
Thom Carternew, make and zero values in Go2021-08-06T00:00:00+00:002021-08-06T00:00:00+00:00/2021/08/06/new-make-and-zero-values-in-go<p><code class="language-plaintext highlighter-rouge">new(T)</code> returns a pointer to a newly allocated zero value of type <code class="language-plaintext highlighter-rouge">T</code>.</p>
<p><code class="language-plaintext highlighter-rouge">make</code> returns an <em>initialised</em> slice, map or channel. For example, a slice which points to an underlying array (an uninitialised slice does not).</p>
<p>In Go, the zero value is a default value assigned to an allocated but uninitialised variable. Each type has its own zero value. For slices and maps, the zero value is <code class="language-plaintext highlighter-rouge">nil</code>.</p>
<p>Once assigned to a variable, <code class="language-plaintext highlighter-rouge">nil</code> is typed and behaves accordingly. You can <code class="language-plaintext highlighter-rouge">append</code> to a <code class="language-plaintext highlighter-rouge">nil</code> slice and look up keys in a <code class="language-plaintext highlighter-rouge">nil</code> map, but you cannot index into a <code class="language-plaintext highlighter-rouge">nil</code> slice or assign to a key in a <code class="language-plaintext highlighter-rouge">nil</code> map: both panic at run time. You cannot compare <code class="language-plaintext highlighter-rouge">nil</code>s of different types.</p>
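<p>A minimal sketch of the behaviour described above. The helper names (<code class="language-plaintext highlighter-rouge">appendToNil</code>, <code class="language-plaintext highlighter-rouge">lookupInNil</code>, <code class="language-plaintext highlighter-rouge">writeToNil</code>) are made up for illustration:</p>

```go
package main

import "fmt"

// appendToNil appends to a nil slice; append allocates a backing array on demand.
func appendToNil() []int {
	var s []int // zero value: nil, no underlying array yet
	return append(s, 1)
}

// lookupInNil reads from a nil map, which behaves like an empty map for reads.
func lookupInNil() (int, bool) {
	var m map[string]int // zero value: nil
	v, ok := m["missing"]
	return v, ok
}

// writeToNil reports whether assigning to a key in a nil map panics.
func writeToNil() (panicked bool) {
	defer func() { panicked = recover() != nil }()
	var m map[string]int
	m["key"] = 1 // assignment to an entry in a nil map panics at run time
	return
}

func main() {
	p := new(int)          // *int pointing at a newly allocated zero value
	made := make([]int, 0) // initialised: non-nil, length 0
	var declared []int     // uninitialised: nil

	fmt.Println(*p, made == nil, declared == nil)
	fmt.Println(appendToNil())
	fmt.Println(lookupInNil())
	fmt.Println(writeToNil())
}
```

<p>The map case is the one that tends to bite: reads from a <code class="language-plaintext highlighter-rouge">nil</code> map are fine, but the first write needs a map created with <code class="language-plaintext highlighter-rouge">make</code> or a literal.</p>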
<ul>
<li><a href="https://golang.org/doc/effective_go#allocation_new">https://golang.org/doc/effective_go#allocation_new</a></li>
<li><a href="https://golang.org/ref/spec#The_zero_value">https://golang.org/ref/spec#The_zero_value</a></li>
<li><a href="https://blog.golang.org/slices-intro">https://blog.golang.org/slices-intro</a></li>
</ul>Thom Carternew(T) returns a pointer to a newly allocated zero value of type T.Growing buffers in Go2021-02-10T00:00:00+00:002021-02-10T00:00:00+00:00/2021/02/10/growing-buffers-in-go<p>I recently reached for Go’s <code class="language-plaintext highlighter-rouge">ioutil.ReadAll</code> utility function to read some data from an HTTP request body. I’d read that this function should be used with care because it can lead to large values being read into memory, potentially causing crashes. I was curious to find out how that could happen.</p>
<p><a href="https://golang.org/src/io/ioutil/ioutil.go">Looking at the source for <code class="language-plaintext highlighter-rouge">ioutil.ReadAll</code></a>, we can see that it reads data into a buffer, implemented as a byte slice, with <code class="language-plaintext highlighter-rouge">bytes.Buffer.ReadFrom</code>. <a href="https://golang.org/src/bytes/buffer.go">Reading the source for <code class="language-plaintext highlighter-rouge">ReadFrom</code></a>, we see that it will read data up to the buffer’s capacity and then attempt to grow the buffer, calling the private <code class="language-plaintext highlighter-rouge">grow</code> function.</p>
<p>From the docs for <code class="language-plaintext highlighter-rouge">grow</code>: <code class="language-plaintext highlighter-rouge">if the buffer can't grow it will panic with ErrTooLarge</code>. Within <code class="language-plaintext highlighter-rouge">grow</code> itself, the condition that triggers this panic is <code class="language-plaintext highlighter-rouge">c > maxInt-c-n</code>, where <code class="language-plaintext highlighter-rouge">c</code> is the capacity of the current buffer and <code class="language-plaintext highlighter-rouge">n</code> is the minimum slice size passed to a <code class="language-plaintext highlighter-rouge">Read</code> call by <code class="language-plaintext highlighter-rouge">Buffer.ReadFrom</code>. This is because, assuming re-slicing (creating a new slice referencing the original but using more of its previously allocated capacity) or dropping previously read bytes are not possible, Go will grow the buffer to twice its current capacity, plus <code class="language-plaintext highlighter-rouge">n</code>.</p>
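<p>To see why the check is written as <code class="language-plaintext highlighter-rouge">c > maxInt-c-n</code> rather than <code class="language-plaintext highlighter-rouge">2*c+n > maxInt</code>, here is a rough sketch. <code class="language-plaintext highlighter-rouge">canDouble</code> is a hypothetical helper, not a function in the <code class="language-plaintext highlighter-rouge">bytes</code> package; it assumes <code class="language-plaintext highlighter-rouge">c</code> and <code class="language-plaintext highlighter-rouge">n</code> are non-negative.</p>

```go
package main

import "fmt"

// maxInt is the largest value an int can hold on this platform.
const maxInt = int(^uint(0) >> 1)

// canDouble reports whether a buffer of capacity c can grow to 2*c + n
// without the new size overflowing int. Hypothetical helper for illustration.
func canDouble(c, n int) bool {
	// Equivalent to 2*c + n <= maxInt, but rearranged so that no
	// intermediate value can overflow when c and n are non-negative.
	return c <= maxInt-c-n
}

func main() {
	fmt.Println(canDouble(64, 512))       // a small buffer has room to grow
	fmt.Println(canDouble(maxInt/2, 512)) // doubling this would overflow
}
```

<p>Computing <code class="language-plaintext highlighter-rouge">2*c+n</code> directly could wrap around to a negative number before the comparison happens, which is why the subtraction form is the safer way to express it.</p>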
<p><code class="language-plaintext highlighter-rouge">maxInt</code> is defined as <code class="language-plaintext highlighter-rouge">int(^uint(0) >> 1)</code>. Unary <code class="language-plaintext highlighter-rouge">^</code> is the bitwise <code class="language-plaintext highlighter-rouge">NOT</code> operator: it inverts every bit of the unsigned integer <code class="language-plaintext highlighter-rouge">0</code>, yielding the maximum possible unsigned integer. <code class="language-plaintext highlighter-rouge">>></code>, the right shift operator, here shifts bits right by one position, equivalent to dividing by two. So this expression returns the maximum possible signed integer (the leftmost bit being the sign bit).</p>
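<p>The same trick is easier to follow on 8-bit integers, where the values can be checked by hand. This is only an illustrative sketch; <code class="language-plaintext highlighter-rouge">maxSigned</code> and <code class="language-plaintext highlighter-rouge">maxSigned8</code> are names invented for the example.</p>

```go
package main

import (
	"fmt"
	"math"
)

// maxSigned derives the largest int in the same way as the expression above.
func maxSigned() int {
	allOnes := ^uint(0)      // every bit set: the largest uint
	return int(allOnes >> 1) // top bit cleared: the largest int
}

// maxSigned8 applies the same steps to 8-bit integers.
func maxSigned8() int8 {
	return int8(^uint8(0) >> 1) // 11111111 >> 1 == 01111111
}

func main() {
	fmt.Printf("%08b -> %08b\n", ^uint8(0), ^uint8(0)>>1)
	fmt.Println(maxSigned8() == math.MaxInt8)

	// Adding one to the largest int wraps around to a negative number,
	// which confirms that there is no larger int value.
	m := maxSigned()
	fmt.Println(m+1 < 0)
}
```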
<p>On a 64-bit architecture then, <code class="language-plaintext highlighter-rouge">grow</code> will refuse to create a slice longer than <code class="language-plaintext highlighter-rouge">9223372036854775807</code>. Practically speaking, we’re always going to exhaust the available memory before reaching that length, so where else can <code class="language-plaintext highlighter-rouge">ErrTooLarge</code> be thrown? The answer is in <code class="language-plaintext highlighter-rouge">makeSlice</code>, called from <code class="language-plaintext highlighter-rouge">grow</code>, which recovers from a panic thrown by <code class="language-plaintext highlighter-rouge">make</code> to throw <code class="language-plaintext highlighter-rouge">ErrTooLarge</code>. <code class="language-plaintext highlighter-rouge">make</code> itself is <a href="https://stackoverflow.com/a/18513087">mapped to a type-specific implementation when compiling</a>, in this case <code class="language-plaintext highlighter-rouge">makeslice</code>, which panics when asked to create a slice larger than the maximum memory allocation.</p>Thom CarterI recently reached for Go’s ioutil.ReadAll utility function to read some data from a HTTP request body. I’d read that this function should be used with care because it can lead to large values being read into memory, potentially causing crashes. I was curious to find out how that could happen.When to reload ActiveRecord objects in tests2021-01-02T00:00:00+00:002021-01-02T00:00:00+00:00/2021/01/02/when-to-reload-activerecord-objects-in-tests<p>A common gotcha in Rails functional testing is asserting against an ActiveRecord object mutated via another instance.
This can easily lead to both false positive and false negative test results.</p>
<p>To ensure accurate results when asserting against an ActiveRecord object, first call <code class="language-plaintext highlighter-rouge">.reload</code> if the record has been updated via another instance in the course of the test. This includes associations, except where these have not previously been loaded from the instance under test.</p>Thom CarterA common gotcha in Rails functional testing is asserting against an ActiveRecord object mutated via another instance. This can easily lead to both false positive and false negative test results.Blocks and yield in Ruby2020-08-12T00:00:00+00:002020-08-12T00:00:00+00:00/2020/08/12/blocks-and-yield-in-ruby<p>I’ve found that the simplest way to think about blocks is as functions that you can pass to methods for execution later.</p>
<p><code class="language-plaintext highlighter-rouge">yield</code> calls a block passed implicitly with the <code class="language-plaintext highlighter-rouge">do ... end</code> or curly brace syntax. Any arguments passed to <code class="language-plaintext highlighter-rouge">yield</code> are passed to the block. You can also pass a block explicitly to a method with an ampersand parameter, which allows you to then <code class="language-plaintext highlighter-rouge">call</code> that block. Passing a block like this converts it to a proc.</p>
<p>Procs and lambdas are like blocks, but they can be stored in variables. Both retain the scope within which they were defined (they are closures). Procs and lambdas behave slightly differently: <code class="language-plaintext highlighter-rouge">return</code> in a proc will return from the method in which the proc was defined, whereas <code class="language-plaintext highlighter-rouge">return</code> in a lambda will just return from the lambda itself; lambdas enforce the number of arguments, whereas procs do not.</p>
<p>A good reference: <a href="https://blog.appsignal.com/2018/09/04/ruby-magic-closures-in-ruby-blocks-procs-and-lambdas.html">https://blog.appsignal.com/2018/09/04/ruby-magic-closures-in-ruby-blocks-procs-and-lambdas.html</a></p>Thom CarterI’ve found that the simplest way to think about blocks is as functions that you can pass to methods for execution later.ERROR: column … must appear in the GROUP BY clause or be used in an aggregate function2020-01-10T00:00:00+00:002020-01-10T00:00:00+00:00/2020/01/10/error-column-must-appear-in-the-group-by-clause<p>This makes sense, because if we try to <code class="language-plaintext highlighter-rouge">SELECT</code> a column not included in the <code class="language-plaintext highlighter-rouge">GROUP BY</code> clause and there are multiple possible values for that column in a group, the database will not know which value to return.</p>
<p>The SQL standard requires that selected columns be functionally dependent on those in the <code class="language-plaintext highlighter-rouge">GROUP BY</code> clause. This means that a primary key can be used in the <code class="language-plaintext highlighter-rouge">GROUP BY</code> clause in place of a full list of selected columns, where the selected columns are dependent on that key.</p>
<ul>
<li><a href="https://www.postgresql.org/docs/12/sql-select.html#SQL-GROUPBY">https://www.postgresql.org/docs/12/sql-select.html#SQL-GROUPBY</a></li>
<li><a href="https://dev.mysql.com/doc/refman/8.0/en/group-by-handling.html">https://dev.mysql.com/doc/refman/8.0/en/group-by-handling.html</a></li>
</ul>Thom CarterThis makes sense, because if we try to SELECT a column not included in the GROUP BY clause and there are multiple possible values for that column in a group, the database will not know which value to return.CSRF, the same-origin policy and CORS2020-01-03T00:00:00+00:002020-01-03T00:00:00+00:00/2020/01/03/crsf-the-same-origin-policy-and-cors<h2 id="csrf">CSRF</h2>
<p>When a user is authenticated with a web service, an attacker can use cross-site request forgery (CSRF) to trick them into making a request to that service to perform a desired action.</p>
<p>If a service uses <code class="language-plaintext highlighter-rouge">GET</code> requests for state-changing operations, the attacker could present the user with a link that triggers an unwanted update. <code class="language-plaintext highlighter-rouge">POST</code> requests can be triggered with a form submission. Both can be generated without deliberate action from the user, for example by embedding an image tag that links to a protected resource on the target service, or by including some JavaScript that submits a form on a mouseover event. An attacker could include malicious links or code on their own site, or embed either elsewhere, particularly where there is a cross-site scripting vulnerability.</p>
<h3 id="countermeasures">Countermeasures</h3>
<p>HTTP methods should be used as intended. Specifically, use <code class="language-plaintext highlighter-rouge">POST</code> not <code class="language-plaintext highlighter-rouge">GET</code> for state-changing operations. Assuming that this is the case, <code class="language-plaintext highlighter-rouge">GET</code> requests can be considered safe: even if a request is unintended, its response is just returned harmlessly to the user’s browser.</p>
<p><code class="language-plaintext highlighter-rouge">POST</code> requests can be protected by including a security token, verified by the server, in forms and AJAX requests, <a href="https://guides.rubyonrails.org/security.html#cross-site-request-forgery-csrf">as happens in Ruby on Rails</a>.</p>
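<p>As a rough sketch of the token idea (written in Go rather than Rails, and much simpler than what Rails actually does), the server generates a random token per session, embeds it in each form, and rejects any state-changing request whose submitted token does not match. The function names here are hypothetical.</p>

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"encoding/base64"
	"fmt"
)

// newToken returns a random, URL-safe token to store in the user's session
// and embed in each form as a hidden field.
func newToken() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(b), nil
}

// validToken compares the submitted token with the session's copy in
// constant time. A request without a session token is always rejected.
func validToken(session, submitted string) bool {
	if session == "" {
		return false
	}
	return subtle.ConstantTimeCompare([]byte(session), []byte(submitted)) == 1
}

func main() {
	token, err := newToken()
	if err != nil {
		panic(err)
	}
	fmt.Println(validToken(token, token))    // genuine form submission
	fmt.Println(validToken(token, "forged")) // attacker cannot guess the token
}
```

<p>This works because a page on another origin can trigger the request, but cannot read the token out of the genuine form in order to include it.</p>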
<p><code class="language-plaintext highlighter-rouge">GET</code> requests for dynamic JavaScript resources can be vulnerable, as an attacker could include such a script in a <code class="language-plaintext highlighter-rouge"><script></code> tag on their own page, and potentially extract user data after the script has executed (e.g. from a <a href="https://www.usenix.org/system/files/conference/usenixsecurity15/sec15-paper-lekies.pdf">global variable</a>). <a href="https://api.rubyonrails.org/classes/ActionController/RequestForgeryProtection.html">Rails prevents embedding of JavaScript responses</a> by default for this reason.</p>
<h2 id="the-same-origin-policy">The same-origin policy</h2>
<p>Whilst an attacker cannot use CSRF to steal data returned from a <code class="language-plaintext highlighter-rouge">GET</code> request, they could theoretically do so by executing a script that makes an AJAX request to a third-party service with the user’s credentials and then forwards the response. Happily, browsers enforce the <a href="https://developer.mozilla.org/en-US/docs/Web/Security/Same-origin_policy">same-origin policy</a> to protect against this.</p>
<p>The origin of a web page is generally the scheme, host and port taken together. Requests to a different origin than that of the requesting page are known as cross-origin requests. When a browser makes a cross-origin AJAX request, the same-origin policy ensures that it will raise an error and not share the response with the calling code. The policy is intended to isolate content from different sources within the browser, and prevents AJAX requests from being used to read a user’s data from a third-party service.</p>
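<p>A simplified sketch of the origin comparison, assuming only <code class="language-plaintext highlighter-rouge">http</code> and <code class="language-plaintext highlighter-rouge">https</code> URLs. Browsers implement considerably more detail than this, and <code class="language-plaintext highlighter-rouge">origin</code> and <code class="language-plaintext highlighter-rouge">sameOrigin</code> are names made up for the example.</p>

```go
package main

import (
	"fmt"
	"net/url"
)

// origin returns scheme://host:port for a URL, filling in the default
// port for http and https so that equivalent origins compare equal.
func origin(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	port := u.Port()
	if port == "" {
		switch u.Scheme {
		case "http":
			port = "80"
		case "https":
			port = "443"
		}
	}
	return u.Scheme + "://" + u.Hostname() + ":" + port, nil
}

// sameOrigin reports whether two URLs share a scheme, host and port.
func sameOrigin(a, b string) bool {
	oa, errA := origin(a)
	ob, errB := origin(b)
	return errA == nil && errB == nil && oa == ob
}

func main() {
	// Same scheme and host, explicit default port: same origin.
	fmt.Println(sameOrigin("https://example.org/a", "https://example.org:443/b"))
	// Different scheme: cross-origin.
	fmt.Println(sameOrigin("https://example.org", "http://example.org"))
	// Different host (a subdomain counts): cross-origin.
	fmt.Println(sameOrigin("https://example.org", "https://api.example.org"))
}
```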
<p>‘Simple’ cross-origin requests (<code class="language-plaintext highlighter-rouge">GET</code>, <code class="language-plaintext highlighter-rouge">HEAD</code> and <code class="language-plaintext highlighter-rouge">POST</code>, with certain content types) are attempted without preceding checks. Other requests (e.g. <code class="language-plaintext highlighter-rouge">POST</code> with JSON or XML content, <code class="language-plaintext highlighter-rouge">PUT</code>, <code class="language-plaintext highlighter-rouge">DELETE</code>) are subject to a preflight <code class="language-plaintext highlighter-rouge">OPTIONS</code> request to determine whether or not the request is safe to send. This is mostly to <a href="https://stackoverflow.com/questions/15381105/cors-what-is-the-motivation-behind-introducing-preflight-requests">protect older servers</a> that might not be aware that browsers are now able to perform these cross-origin requests. Note that it is still necessary to take measures to protect against cross-origin <code class="language-plaintext highlighter-rouge">POST</code> requests made via AJAX, which have always been possible.</p>
<h3 id="cors">CORS</h3>
<p><a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS">Cross-Origin Resource Sharing (CORS)</a> is a mechanism that allows site owners to selectively relax the same-origin policy by configuring their web server to set certain headers. Setting the <code class="language-plaintext highlighter-rouge">Access-Control-Allow-Origin</code> header to a given origin, for example, will allow pages served from that origin to make cross-origin requests to the site and read the responses.</p>
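<p>For illustration, this is roughly what setting the header might look like in a Go HTTP server, including a basic answer to preflight requests. It is a sketch under assumptions: <code class="language-plaintext highlighter-rouge">allowedOrigin</code> is a hypothetical trusted origin, the allowed methods and headers are arbitrary, and <code class="language-plaintext highlighter-rouge">withCORS</code> and <code class="language-plaintext highlighter-rouge">allowOriginFor</code> are invented names.</p>

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// allowedOrigin is a hypothetical trusted front-end origin.
const allowedOrigin = "https://app.example.org"

// withCORS wraps a handler and adds CORS headers for allowedOrigin only.
func withCORS(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Add("Vary", "Origin") // the response depends on the Origin header
		if r.Header.Get("Origin") == allowedOrigin {
			w.Header().Set("Access-Control-Allow-Origin", allowedOrigin)
		}
		// Answer preflight requests without invoking the wrapped handler.
		if r.Method == http.MethodOptions && r.Header.Get("Access-Control-Request-Method") != "" {
			w.Header().Set("Access-Control-Allow-Methods", "GET, POST, PUT, DELETE")
			w.Header().Set("Access-Control-Allow-Headers", "Content-Type")
			w.WriteHeader(http.StatusNoContent)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// allowOriginFor returns the Access-Control-Allow-Origin value the server
// sends in reply to a GET request from the given origin.
func allowOriginFor(origin string) string {
	h := withCORS(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set("Origin", origin)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Header().Get("Access-Control-Allow-Origin")
}

func main() {
	fmt.Printf("%q\n", allowOriginFor(allowedOrigin))
	fmt.Printf("%q\n", allowOriginFor("https://evil.example"))
}
```

<p>Note that the server still processes requests from other origins; it simply omits the header, and it is the browser that then withholds the response from the calling script.</p>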
<p>For credentials (cookies) to be used in such requests, they must be explicitly included in the request itself, and the <code class="language-plaintext highlighter-rouge">Access-Control-Allow-Credentials</code> header must be set.</p>Thom CarterCSRFCharles Booth’s London2017-05-26T00:00:00+00:002017-05-26T00:00:00+00:00/2017/05/26/rebuilding-charles-booths-london<p>I spent a large part of last year working on <a href="https://booth.lse.ac.uk">Charles Booth’s London</a>. The site provides access to digitised content from a nineteenth-century study of poverty in London, including an interactive map of the city.</p>
<p>Back in January, I published <a href="https://medium.com/@tjvc/re-building-charles-booths-london-f042ed23a32a">a write-up of the technical work involved</a> on Medium.</p>Thom CarterI spent a large part of last year working on Charles Booth’s London. The site provides access to digitised content from a nineteenth-century study of poverty in London, including an interactive map of the city.How to use ZNC with a domain name over SSL2017-04-23T00:00:00+00:002017-04-23T00:00:00+00:00/2017/04/23/use-znc-with-a-domain-name-and-SSL<p><a href="http://wiki.znc.in/">ZNC</a> is an IRC bouncer, a service that connects to an IRC server and relays messages between that server and your IRC client. ZNC offers <a href="http://wiki.znc.in/Introduction">several benefits</a>: it can maintain a connection to an IRC server after a client disconnects and buffer messages sent while the client is disconnected.</p>
<p>I’ve been running ZNC on a VPS for a while, using a self-signed SSL certificate for encrypted connections. I thought it would be nice to use a subdomain to access the web interface and for IRC connections, which in turn would allow me to use a verified SSL certificate from <a href="https://letsencrypt.org/">Let’s Encrypt</a>. This is how I did it.</p>
<p>This guide assumes that you already have a server running ZNC (<a href="https://www.vultr.com/docs/install-and-setup-znc-on-ubuntu">this is a good guide to getting set up</a>), and a domain name pointing to that server. I used the Apache web server running on Ubuntu to enable access to the ZNC web interface via my domain name on ports 80 and 443, and the examples below reflect that. If your server’s behind a firewall, you’ll also want to make sure that you’ve opened ports 80, 443 and 6697.</p>
<h2 id="configure-znc">Configure ZNC</h2>
<p>Firstly, we need to configure ZNC to listen for connections to the web interface and IRC. We configure these connections separately, because we want to use SSL for IRC connections (which are made directly to the server), but not for connections to the web interface (which we will make via an internal reverse proxy). We make the changes on the global settings page of the web admin interface; the relevant section should look like this:</p>
<p><img src="/assets/znc-web-interface.png" alt="ZNC web interface" class="center-image" /></p>
<p>The first line is for IRC connections, the second for connections to the web interface. You may have already configured ZNC to listen for connections to the web interface on a different port. If so, you can leave this entry for now and remove it later.</p>
<h2 id="ssl">SSL</h2>
<p>Next we need to obtain an SSL certificate from Let’s Encrypt, and install it on our server. Thankfully, the <a href="https://www.eff.org/">Electronic Frontier Foundation</a> has provided the <a href="https://certbot.eff.org/">Certbot</a> utility, which automates this process.</p>
<p>You can use Certbot to generate a certificate to install manually, but the tool can also handle the configuration of several common software combinations, including, in my case, Apache and Ubuntu. So for me the first step was to install Certbot:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ sudo apt-get install software-properties-common
$ sudo add-apt-repository ppa:certbot/certbot
$ sudo apt-get update
$ sudo apt-get install python-certbot-apache
</code></pre></div></div>
<p>And after that, I just had to run the tool with the <code class="language-plaintext highlighter-rouge">--apache</code> option:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ sudo certbot --apache
</code></pre></div></div>
<p>Certbot will then ask you to specify which domain name you want to issue a certificate for, to provide a contact email address, and to choose whether you want to enable both HTTP and HTTPS access or to redirect all requests to HTTPS (I’d recommend the latter). This will create and enable an Apache configuration file (<code class="language-plaintext highlighter-rouge">/etc/apache2/sites-available/000-default-le-ssl.conf</code>) with the required SSL configuration directives. If you choose to redirect all requests to HTTPS, Certbot will also update your default Apache configuration file (<code class="language-plaintext highlighter-rouge">/etc/apache2/sites-available/000-default.conf</code>) to enable this.</p>
<p>Once the certificate files have been installed for Apache, we need to concatenate them into a single file in the ZNC directory (which, for me, is in the home directory of the <code class="language-plaintext highlighter-rouge">znc</code> user).</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ sudo cat /etc/letsencrypt/live/example.org/privkey.pem /etc/letsencrypt/live/example.org/cert.pem /etc/letsencrypt/live/example.org/chain.pem > /home/znc/.znc/znc.pem
</code></pre></div></div>
<p>At this point you should be able to make encrypted IRC connections to your ZNC server on port 6697.</p>
<h3 id="certificate-renewal">Certificate renewal</h3>
<p>Let’s Encrypt certificates expire after 90 days, so we need to configure our server to automatically renew the certificate. To do that, we first open the root crontab file for editing:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ sudo crontab -e
</code></pre></div></div>
<p>And add the following line:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>@monthly certbot renew --post-hook "cat /etc/letsencrypt/live/example.org/privkey.pem /etc/letsencrypt/live/example.org/cert.pem /etc/letsencrypt/live/example.org/chain.pem > /home/znc/.znc/znc.pem"
</code></pre></div></div>
<p>Now Certbot will renew the certificate, if required, once a month, and create a new <code class="language-plaintext highlighter-rouge">.pem</code> file in the ZNC directory using the new certificate files (the <code class="language-plaintext highlighter-rouge">--post-hook</code> option specifies a command to be run after certificate renewal).</p>
<h2 id="set-up-a-reverse-proxy">Set up a reverse proxy</h2>
<p>To make the ZNC web interface available at a given domain name, we need to set up a <a href="https://httpd.apache.org/docs/2.4/mod/mod_proxy.html#forwardreverse">reverse proxy</a>. This will direct HTTP(S) requests to the server on ports 80 and 443 to the ZNC web interface running on port 8080. On my server, I used Apache to set up a reverse proxy. To do this, we first have to enable the <code class="language-plaintext highlighter-rouge">mod_proxy</code> and <code class="language-plaintext highlighter-rouge">mod_proxy_http</code> modules:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ sudo a2enmod proxy proxy_http
</code></pre></div></div>
<p>With these modules enabled, we just need to add the following lines to the Apache configuration file(s). If you’re redirecting HTTP requests to HTTPS, just add them to the SSL configuration file added by Certbot. Otherwise, add them to both this file and the default Apache configuration file.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ProxyPass "/" "http://127.0.0.1:8080/"
ProxyPassReverse "/" "http://127.0.0.1:8080/"
</code></pre></div></div>
<p>And that’s it, you should now be able to use your domain name to make encrypted connections to ZNC’s IRC server and web interface!</p>Thom CarterZNC is an IRC bouncer, a service that connects to an IRC server and relays messages between that server and your IRC client. ZNC offers several benefits: it can maintain a connection to an IRC server after a client disconnects and buffer messages sent while the client is disconnected.