<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title><![CDATA[davedevelopment.co.uk]]></title>
    <link href="https://davedevelopment.co.uk/atom.xml" rel="self"/>
    <link href="https://davedevelopment.co.uk/"/>
    <updated>2020-08-25T20:45:49+00:00</updated>
    <id>https://davedevelopment.co.uk/</id>
        <generator uri="http://sculpin.io/">Sculpin</generator>
            <entry>
            <title type="html"><![CDATA[In defence of Mocking Frameworks]]></title>
            <link href="https://davedevelopment.co.uk/2020/05/14/in-defence-of-mocking-frameworks.html"/>
            <updated>2020-05-14T00:00:00+00:00</updated>
            <id>https://davedevelopment.co.uk/2020/05/14/in-defence-of-mocking-frameworks.html</id>
            <content type="html"><![CDATA[<p>Recently Frank de Jonge published a blog post on <a href="https://blog.frankdejonge.nl/testing-without-mocking-frameworks/">Testing without mocking
frameworks</a>.
It's a good read and a well thought out post, but there are a few things I
disagree with. This post is my commentary; you might want to read Frank's
post first.</p>

<p>I do have some skin in the game here: I'm one of the maintainers of
<a href="https://github.com/mockery/mockery">Mockery</a>, the mocking framework Frank uses
in his examples. I maintain it because I use it a lot, but it doesn't provide
any income, so my defence is purely theoretical. I won't lose my livelihood
if people stop using Mockery.</p>

<p>NB: When I use the word "mock" in this post, I usually use it in the vague way
of describing any kind of test double, rather than the by-the-book definition.</p>

<p>I'm going to focus on the examples Frank uses, how I would see it and
where I would go from there.</p>

<pre><code class="php">/**
 * @test
 */
public function testing_something_with_mocks(): void  
{
    $mock = Mockery::mock(ExternalDependency::class);

    $mock-&gt;shouldReceive('oldMethodName')
        -&gt;once()
        -&gt;with('some_argument')
        -&gt;andReturn('some_response');

    $dependingCode = new DependingCode($mock);
    $result = $dependingCode-&gt;performOperation();

    $this-&gt;assertEquals('some_response', $result);
}
</code></pre>

<p>Frank argues that performing a rename method refactoring with his IDE would
break this test. I have no argument against this; when it happens, I fix the
tests manually.</p>

<p>In more recent versions of Mockery, we can remove some of the stringiness, but I
still don't think it would make IDEs any more capable of refactoring
without some sort of dynamic return types for the <code>Mock::allows</code> and
<code>Mock::expects</code> methods.</p>

<pre><code class="php">$mock-&gt;allows()
    -&gt;oldMethodName('some_argument')
    -&gt;andReturns('some_response');
</code></pre>

<p>It won't help much with the refactoring, but it looks a lot better and would
give us some extra context when searching the codebase in that we can see it is
a method call, rather than just a string matching the old name.</p>

<p>Frank then goes on to show a more concrete example, with an <code>InvoiceService</code>.
His first iteration of the test looks like this:</p>

<pre><code class="php">use League\Flysystem\Filesystem;  
use PHPUnit\Framework\TestCase;

class TestInvoiceServiceWithMocksTest extends TestCase  
{
    /**
     * @test
     */
    public function invoicing_a_client(): void
    {
        // Need to work around marking this test as risky
        $this-&gt;expectNotToPerformAssertions();

        $mock = Mockery::mock(Filesystem::class);
        $mock-&gt;shouldReceive('write')
            -&gt;once()
            -&gt;with('/invoices/abc/2020/3.txt', '{"client_id":"abc","amount":42,"invoice_period":{"year":2020,"month":3}}');

        $invoicePeriod = InvoicePeriod::fromDateTime(DateTimeImmutable::createFromFormat('!Y-m', '2020-03'));
        $invoiceService = new InvoiceService($mock);
        $invoiceService-&gt;invoiceClient('abc', $invoicePeriod);
    }

    protected function tearDown(): void
    {
        Mockery::close();
    }
}
</code></pre>

<p>First thing I notice is that Frank isn't using Mockery's PHPUnit Integration. We
ship a base <code>MockeryTestCase</code>, or you can use a trait. I've just created a PR to <a href="https://github.com/mockery/mockery/pull/1061">add a
note to the readme about this</a>.</p>

<p>Pulling in the integration makes both the <code>tearDown</code> method and the call telling PHPUnit
not to expect assertions redundant, so we can remove them.</p>

<pre><code class="php">// Trait from the Mockery\Adapter\Phpunit namespace, used inside the test class
use MockeryPHPUnitIntegration;

/**
 * @test
 */
public function invoicing_a_client(): void
{
    $mock = Mockery::mock(Filesystem::class);
    $mock-&gt;shouldReceive('write')
        -&gt;once()
        -&gt;with('/invoices/abc/2020/3.txt', '{"client_id":"abc","amount":42,"invoice_period":{"year":2020,"month":3}}');

    $invoicePeriod = InvoicePeriod::fromDateTime(DateTimeImmutable::createFromFormat('!Y-m', '2020-03'));
    $invoiceService = new InvoiceService($mock);
    $invoiceService-&gt;invoiceClient('abc', $invoicePeriod);
}
</code></pre>

<p>Now that the PHPUnit awkwardness is out of the way, Frank notes three things:</p>

<ul>
<li>Amount of mocking code vs other code.</li>
<li>High coupling between test and implementation.</li>
<li>The test does not describe desired behavior, it validates implementation.</li>
</ul>

<p>Only the first of those items is specific to using mocking frameworks; the
other two are possible with or without one.</p>

<p>Frank's first step in improving this test is to replace the generated mock with
a concrete one. This does give him some improved readability with regards to
<a href="https://wiki.c2.com/?ArrangeActAssert">Arrange, Act, Assert</a>, but that is
achievable using a mocking framework too and is the first step I would have
taken.</p>

<pre><code class="php">/**
 * @test
 */
public function invoicing_a_client(): void
{
    // Arrange
    $mock = spy(Filesystem::class);
    $invoiceService = new InvoiceService($mock);

    // Act
    $invoiceService-&gt;invoiceClient(
        'abc',
        InvoicePeriod::fromDateTime(DateTimeImmutable::createFromFormat('!Y-m', '2020-03'))
    );

    // Assert
    $mock-&gt;shouldHaveReceived()
        -&gt;write('/invoices/abc/2020/3.txt', '{"client_id":"abc","amount":42,"invoice_period":{"year":2020,"month":3}}');
}
</code></pre>

<p>This looks very similar to Frank's first revision. I've replaced the mock with a
<a href="/2014/10/09/mockery-spies.html">spy</a>, which allows me to verify that calls were
made after the fact, rather than declaring them up front.</p>

<p>The big difference is Frank's test uses a concrete implementation of a
<code>Filesystem</code>. Frank gains confidence from this, because he knows the
<code>InMemoryFilesystemAdapter</code> runs against the same set of tests as the other more
useful versions. I wouldn't have the same confidence as Frank, because I don't
trust that library. I don't trust most of the libraries I use and I try not to
mock types I don't trust. And by that I mean <em>either</em> mocks generated by a
framework or concrete mocks created by hand.</p>

<p>Both my version and Frank's version still have that big horrible JSON string,
the high coupling between the test and the implementation. Both versions still
don't describe the desired behaviour, they describe what the SUT does and what
the outcome is, with explicit detail.  We both tackle this next, but in a
different way.</p>

<p>The domain experts tell us the <code>Invoice</code> should be <em>submitted</em> to the
<code>InvoicePortal</code>, so I'm going to update my test to describe that exactly. I'm
also going to make sure the test name describes that.</p>

<pre><code class="php">/**
 * @test
 */
public function invoicing_a_client_submits_the_invoice_to_the_invoice_portal(): void
{
    // Arrange
    // A spy, so the call can be verified after the fact
    $invoicePortal = Mockery::spy(InvoicePortal::class);
    $invoiceService = new InvoiceService($invoicePortal);

    // Act
    $invoicePeriod = InvoicePeriod::fromDateTime(DateTimeImmutable::createFromFormat('!Y-m', '2020-03'));
    $invoiceService-&gt;invoiceClient('abc', $invoicePeriod);

    // Assert
    $expectedInvoice = new Invoice('abc', 42, $invoicePeriod);
    $invoicePortal-&gt;shouldHaveReceived()
        -&gt;submitInvoice(equalTo($expectedInvoice));
}
</code></pre>

<p>At this stage, the <code>InvoicePortal</code> doesn't exist. The authors behind the
<a href="http://www.growing-object-oriented-software.com/">GOOS</a> book call this
<em>programming by wishful thinking</em>. I know the <code>InvoiceService</code> needs to submit
an <code>Invoice</code> to an <code>InvoicePortal</code>, so I'm going to describe that in my test and
Mockery facilitates it for me. I then would update the SUT accordingly. The
astute among you will notice that Mockery isn't really helping me <em>test</em> the SUT
here and we will need more tests to actually ensure the system works. What
Mockery is doing is helping me <em>design</em> the SUT. It has perfectly described
that I will need an <code>InvoicePortal</code>, and that thing will need a <code>submitInvoice</code>
method that takes an <code>Invoice</code>.</p>

<p>I would now move <em>in</em> to the next layer of my application and start building the
<code>InvoicePortal</code>, which will have its own suite of tests, just like in Frank's
example. One difference being, I have defined my meaningful boundaries already,
by writing an executable example of the way the consumer would like things to
work. By writing the <code>InvoicePortal</code> first, Frank is defining the meaningful
boundaries in the inner layer, and once he has built them, the outer
layer gets to try them on for size.</p>

<p>In the outside-in methodology, the outer layers <em>describe</em> what is required from
the inner layers, and then the programmer builds them. In the inside-out
methodology, the programmer builds the inner layer which <em>prescribes</em> what is
available and the outer layer works around it. With experienced programmers, the
outcome will most often be the same, just a slightly different journey.</p>

<p>Another difference is that I wouldn't have bothered creating a
<code>FakeInvoicePortal</code>. I already have a fake generated for me by Mockery and
unless I'm going to get a lot of benefit and reuse, I don't take the time to
write concrete fakes by default.</p>
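
<p>For comparison, a hand-written fake for this example might look something
like the sketch below. It is my own illustration rather than Frank's code: the
two-argument <code>Invoice</code> (I've dropped the <code>InvoicePeriod</code>), the
<code>hasReceived</code> helper and the hard-coded amount are all assumptions made to
keep it short and runnable on its own.</p>

<pre><code class="php">&lt;?php

// Hypothetical, simplified stand-ins for the types discussed in this post.
final class Invoice
{
    private $clientId;
    private $amount;

    public function __construct(string $clientId, int $amount)
    {
        $this-&gt;clientId = $clientId;
        $this-&gt;amount = $amount;
    }
}

interface InvoicePortal
{
    public function submitInvoice(Invoice $invoice): void;
}

// The hand-written fake: it records what it receives so a test can inspect it.
final class FakeInvoicePortal implements InvoicePortal
{
    /** @var Invoice[] */
    private $submitted = [];

    public function submitInvoice(Invoice $invoice): void
    {
        $this-&gt;submitted[] = $invoice;
    }

    public function hasReceived(Invoice $expected): bool
    {
        // Non-strict in_array compares with ==, so two Invoice objects
        // with equal properties count as the same invoice.
        return in_array($expected, $this-&gt;submitted, false);
    }
}

final class InvoiceService
{
    private $portal;

    public function __construct(InvoicePortal $portal)
    {
        $this-&gt;portal = $portal;
    }

    public function invoiceClient(string $clientId): void
    {
        // The amount is hard-coded purely for illustration.
        $this-&gt;portal-&gt;submitInvoice(new Invoice($clientId, 42));
    }
}

$portal = new FakeInvoicePortal();
(new InvoiceService($portal))-&gt;invoiceClient('abc');

var_dump($portal-&gt;hasReceived(new Invoice('abc', 42))); // bool(true)
var_dump($portal-&gt;hasReceived(new Invoice('xyz', 42))); // bool(false)
</code></pre>

<p>It isn't much code, but it is code I'd have to write and maintain, which is
why I only reach for it when I expect to reuse it.</p>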

<h2 id="what-did-we-achieve%3F">What did we achieve?</h2>

<p>We still don't have the nice IDE refactoring mentioned earlier, but I would
argue that my example achieves everything else Frank achieved with his
refactoring, with less effort and (to me) preferable design guidance.</p>

<blockquote>
  <p>The tests no longer contain implementation details.</p>
</blockquote>

<p>We have the same outcome here: there are no more and no fewer implementation
details in my test than in Frank's. You could argue about the amount of
coupling: Frank's tests are coupled to his manually crafted mock, mine are coupled to
Mockery's generated mocks.</p>

<blockquote>
  <p>Clearer high-level code.</p>
</blockquote>

<p>This has nothing to do with the tests, and therefore nothing to do with whether you use a
mocking framework: the code for the SUT is identical for both my way of working and Frank's.</p>

<blockquote>
  <p>Low-complexity fakes are easy to control.</p>
</blockquote>

<p>I personally find Mockery's mocks very easy to control, but I have a bias there.
Mockery ships with the ability to stage exceptions and fix responses and, if you
know how to use them, they will save you time writing code. That said, it's far
easier to understand and trust code you have written yourself, so I see where
Frank is coming from. I personally trust Mockery enough to lean on it and save
myself the time.</p>

<pre><code class="php">$invoicePortal = Mockery::mock(InvoicePortal::class);
$invoicePortal
    -&gt;allows()
    -&gt;submitInvoice(anyArgs())
    -&gt;andThrows(new \Exception());
</code></pre>

<p>Fin.</p>
]]></content>
        </entry>
            <entry>
            <title type="html"><![CDATA[Importing HIBP&#039;s pwned password list in to DynamoDB]]></title>
            <link href="https://davedevelopment.co.uk/2017/08/11/importing-pwned-passwords-in-to-dynamodb.html"/>
            <updated>2017-08-11T00:00:00+00:00</updated>
            <id>https://davedevelopment.co.uk/2017/08/11/importing-pwned-passwords-in-to-dynamodb.html</id>
            <content type="html"><![CDATA[<div class="alert">
<h2>⚠️ Update! </h2>

Troy Hunt recently released V2 of <a href="https://haveibeenpwned.com/Passwords">HIBP Passwords</a> and removed the API rate
limits, so you're probably better off using that service than setting up your
own copy.

</div>

<p>Troy Hunt recently <a href="https://www.troyhunt.com/introducing-306-million-freely-downloadable-pwned-passwords/">introduced HIBP
Passwords</a>,
a freely <a href="https://haveibeenpwned.com/Passwords">downloadable list of over 300 million
passwords</a> that have been pwned in the
various breaches the site records. There is an <a href="https://haveibeenpwned.com/API/v2#PwnedPasswords">API to access the
list</a> for auditing and
checking passwords, but it's rate limited, and I thought it would be more
friendly to import the passwords in to a database we control. It looks like
HIBP uses <a href="https://www.troyhunt.com/working-with-154-million-records-on/">Azure Table
Storage</a> to make
the data quickly accessible. I do most of my work on AWS, so I thought I'd take
a look at importing the hashes in to
<a href="https://aws.amazon.com/dynamodb/">DynamoDB</a>. It's relatively cheap to run and easy to
use. I thought it might be useful for others, so here's the rundown.</p>

<p>The first thing I tried was to follow a <a href="http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-importexport-ddb-part1.html">tutorial using AWS Data
Pipeline</a>,
which seemed to be exactly what I needed. In the end though, I found the
tutorial made a few assumptions that I either missed or they failed to mention,
mostly around the expected data format for the CSV files. Thankfully though,
this led me towards using <a href="https://aws.amazon.com/emr/">AWS Elastic
MapReduce</a> directly and this turned out to be the
winning formula.</p>

<p>To start with, I needed to get the data from HIBP and upload it to S3. I think
this method would have worked if the source files had been gzipped, but I
wasn't too sure about 7zipped, so I decompressed them before uploading them to
S3.</p>

<pre><code>mkdir hibp-passwords
cd hibp-passwords
wget https://downloads.pwnedpasswords.com/passwords/pwned-passwords-1.0.txt.7z
wget https://downloads.pwnedpasswords.com/passwords/pwned-passwords-update-1.txt.7z
wget https://downloads.pwnedpasswords.com/passwords/pwned-passwords-update-2.txt.7z
7zr e pwned-passwords-1.0.txt.7z
7zr e pwned-passwords-update-1.txt.7z
7zr e pwned-passwords-update-2.txt.7z
rm *.7z
aws s3 mb s3://hibp-passwords-123
aws s3 sync ./ s3://hibp-passwords-123
</code></pre>

<p>I created a DynamoDB table to hold the data. It only needs the one attribute,
which will also be the partition key.</p>

<p><img src="/img/hibp/table.png" alt="Screenshot of table setup" /></p>

<p>Now adjust the write capacity to allow our import to go reasonably
quickly. I set the capacity to 10000 units, which I think was the maximum
without having to ask AWS to lift the limits.</p>

<p><img src="/img/hibp/capacity.png" alt="Screenshot of setting capacity" /></p>

<p>We then need to create our EMR cluster. I experimented with different sizes to
begin with, but settled on a cluster of 16 c4.8xlarge instances. I left the rest of
the settings as the defaults. Between this and the write capacity we set on the
DynamoDB table, this meant the import took around 16 hours. There's probably a
better combination of cluster size and write capacity, but it was good enough
for me. I should point out here that this isn't a cheap way to do this: I think
the work of this EMR cluster will come to around $400. I imagine someone who
knows what they are doing with Hadoop/Hive/etc, could do this much more
efficiently.</p>

<p><img src="/img/hibp/nodes.png" alt="Screenshot of node settings" /></p>

<p>Now we're all good to go. Log in to the EMR master node, install tmux and fire up <a href="https://hive.apache.org/">Hive</a>.</p>

<pre><code>ssh -i ~/.ssh/your-key.pem hadoop@your-master-node-public-url
sudo yum install tmux
tmux
hive
</code></pre>

<p>We need to tell Hive to use as much DynamoDB write capacity as it sees fit, since no
other resources are currently trying to access the table.</p>

<pre><code>&gt; SET dynamodb.throughput.write.percent=1.5;
</code></pre>

<p>We then create our first Hive table, by telling it where to find our data.</p>

<pre><code>&gt; CREATE EXTERNAL TABLE s3_hibp_passwords(a_col string)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' 
    LOCATION 's3://hibp-passwords-123/';
OK
Time taken: 3.371 seconds
</code></pre>

<p>The first time I tried running the import, I ran in to trouble with it trying to enter
duplicate items in to the DynamoDB table. A bit of searching led me to a
Stack Overflow question and a means to create another table, but with only
unique hashes.</p>

<pre><code>&gt; CREATE TABLE s3_hibp_passwords_dedup AS
    SELECT a_col
    FROM (SELECT a_col, rank() OVER
            (PARTITION BY a_col)
            AS col_rank FROM s3_hibp_passwords) t
    WHERE t.col_rank = 1
    GROUP BY a_col;
Query ID = hadoop_20170810144127_818de1ea-0c47-4a16-a5bf-5b77e3336d69
Total jobs = 1
Launching Job 1 out of 1
Tez session was closed. Reopening...
Session re-established.


Status: Running (Executing on YARN cluster with App id application_1502375367082_0202)

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container     SUCCEEDED    257        257        0        0       0       0
Reducer 2 ...... container     SUCCEEDED     53         53        0        0       0       0
Reducer 3 ...... container     SUCCEEDED     27         27        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 03/03  [==========================&gt;&gt;] 100%  ELAPSED TIME: 105.63 s
----------------------------------------------------------------------------------------------
Moving data to directory hdfs://ip-172-31-43-51.eu-west-1.compute.internal:8020/user/hive/warehouse/s3_hibp_passwords_dedup
OK
Time taken: 112.395 seconds
</code></pre>

<p>Now we need to create another Hive table; this one will be backed by our DynamoDB table.</p>

<pre><code>&gt; CREATE EXTERNAL TABLE ddb_hibp_passwords (col1 string)
    STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler' 
    TBLPROPERTIES ("dynamodb.table.name" = "pwned_passwords", "dynamodb.column.mapping" = "col1:sha1");  
OK
Time taken: 0.925 seconds
</code></pre>

<p>That's it, we're all ready to go. This is going to take some time, so set it running and go to bed.</p>

<pre><code>&gt; INSERT OVERWRITE TABLE ddb_hibp_passwords 
    SELECT * FROM s3_hibp_passwords_dedup;
Query ID = hadoop_20170810144407_872afa4b-ca88-4b57-a284-5496fcb0d30f
Total jobs = 1
Launching Job 1 out of 1


Status: Running (Executing on YARN cluster with App id application_1502375367082_0002)

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container     SUCCEEDED    254        254        0        0      18       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================&gt;&gt;] 100%  ELAPSED TIME: 59056.94 s
----------------------------------------------------------------------------------------------
OK
Time taken: 59058.426 seconds
hive&gt;

</code></pre>

<p>I don't know what happened with the failures; I didn't bother checking. I know
I ended up with 319,169,449 hashes in the database, so good enough for me.</p>

<p><img src="/img/hibp/items.png" alt="Screenshot of items" /></p>

<p>Back in the terminal, we can quickly check to see how this is going to work.
Note that I shifted the hash to uppercase, the HIBP source data hashes are all
uppercase.</p>

<pre><code>davem@wes:~$ SHA1DAVE=$(echo -n dave | sha1sum | tr "[a-z]" "[A-Z]" | awk '{ print $1 }')
davem@wes:~$ echo $SHA1DAVE
BFCDF3E6CA6CEF45543BFBB57509C92AEC9A39FB
davem@wes:~$ aws dynamodb get-item --table-name hibp_passwords --key "{\"sha1\": {\"S\": \"$SHA1DAVE\"}}"
{
    "Item": {
        "sha1": {
            "S": "BFCDF3E6CA6CEF45543BFBB57509C92AEC9A39FB"
        }
    }
}
real 0.46
user 0.30
sys 0.04
davem@wes:~$
</code></pre>
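
<p>If you want to do the same check from application code, the key is just an
uppercase SHA-1 hex digest. Here's a minimal PHP sketch, assuming the partition
key attribute is called <code>sha1</code> as in the column mapping above. The function
names are mine and purely illustrative.</p>

<pre><code class="php">&lt;?php

// Uppercase SHA-1 hex digest, matching the format of the HIBP source data.
function pwnedPasswordKey(string $password): string
{
    return strtoupper(sha1($password));
}

// JSON key document for the (assumed) `sha1` string attribute.
function dynamoDbKey(string $password): string
{
    return json_encode(['sha1' =&gt; ['S' =&gt; pwnedPasswordKey($password)]]);
}

echo dynamoDbKey('dave'), PHP_EOL;
</code></pre>

<p>The JSON it prints has the same shape as the <code>--key</code> argument passed to
<code>aws dynamodb get-item</code> above.</p>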

<p>Have fun auditing your passwords.</p>
]]></content>
        </entry>
            <entry>
            <title type="html"><![CDATA[Using closures as PHPUnit After Hooks]]></title>
            <link href="https://davedevelopment.co.uk/2016/12/23/using-closures-as-phpunit-after-hooks.html"/>
            <updated>2016-12-23T00:00:00+00:00</updated>
            <id>https://davedevelopment.co.uk/2016/12/23/using-closures-as-phpunit-after-hooks.html</id>
            <content type="html"><![CDATA[<p>Not sure why I didn't start doing this sooner.  We have a basic Feature toggle
system that is maintained in the global scope to make it easily accessible to
any part of the code:</p>

<pre><code class="php">Feature::isEnabled("something_awesome");
</code></pre>

<p>I needed to force a particular feature on in a PHPUnit integration test,
but in order to tidy up after myself, I would need to ensure that the test reset
the Feature system after it had finished. There are a few ways of
doing this.</p>

<p>The first is to consider using PHPUnit's built in features for <a href="https://phpunit.de/manual/current/en/fixtures.html#fixtures.global-state">preserving
global state</a>.
I've never really used them, but I have seen plenty of problems caused by them,
so I didn't even bother looking.</p>

<p>Another option is to wrap the test execution in a try/catch/finally block and ensure the global
state is reset regardless of what the test code does:</p>

<pre><code class="php">    /** @test */
    public function single_nasty_test_with_global_scope_changes()
    {
        GlobalScope::$thing = true;

        try {
            // exercise system that uses GlobalScope::$thing
            $this-&gt;assertTrue(GlobalScope::$thing);
        } finally {
            GlobalScope::$thing = false;
        }
    }
</code></pre>

<p>This isn't too bad, but I'd prefer to avoid the indentation, and it would look a lot
messier if there were several lines of test code.</p>

<p>Another option is to add the reset code to your <code>tearDown</code> method or add a totally new <code>@after</code> method.</p>

<pre><code class="php">    /** @after */
    public function reset_global_state()
    {
        GlobalScope::$thing = false;
    }

    /** @test */
    public function single_nasty_test_with_global_scope_changes()
    {
        GlobalScope::$thing = true;

        // exercise system that uses GlobalScope::$thing
        $this-&gt;assertTrue(GlobalScope::$thing);
    }
</code></pre>

<p>Again, this isn't too bad, but it gets a bit lost for me. It applies to every
test method in the class (which would be a dozen or so in my instance), rather
than the one test method that needs it. It's also separated from the test
method, so it's not immediately clear whether the global scope is being reset
accordingly.</p>

<p>None of these really appealed to me, so I had a quick look inside PHPUnit to see
if it had anything that would allow me to set it up contextually, right next to
that particular test method's setup. There wasn't anything that I could see, but it didn't
take two minutes to write this little trait:</p>

<pre><code class="php">&lt;?php

trait AfterHooks
{
    private $afterHooks = [];

    public function after(callable $callback)
    {
        $this-&gt;afterHooks[] = $callback;
    }

    /**
     * @after
     */
    public function runAfterHooks()
    {
        $afterHooks = $this-&gt;afterHooks;
        $this-&gt;afterHooks = [];

        foreach ($afterHooks as $afterHook) {
            $afterHook();
        }
    }
}
</code></pre>

<p>And here is how you use it:</p>

<pre><code class="php">    use AfterHooks;

    /** @test */
    public function some_nasty_test_with_global_scope_changes()
    {
        $this-&gt;after(function () { GlobalScope::$thing = false; });

        GlobalScope::$thing = true;

        // exercise system that uses GlobalScope::$thing
        $this-&gt;assertTrue(GlobalScope::$thing);
    }

    /** @test */
    public function another_test()
    {
        // exercise system that uses GlobalScope::$thing
        $this-&gt;assertFalse(GlobalScope::$thing);
    }

</code></pre>
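
<p>If you want to see the mechanics without PHPUnit in the way, here's a small
standalone sketch. The trait is repeated so the snippet runs on its own,
<code>FakeTestCase</code> is a made-up stand-in for a real test case, and I call
<code>runAfterHooks</code> by hand where PHPUnit would normally call it through the
<code>@after</code> annotation.</p>

<pre><code class="php">&lt;?php

// Same trait as above, repeated so this sketch is self-contained.
trait AfterHooks
{
    private $afterHooks = [];

    public function after(callable $callback)
    {
        $this-&gt;afterHooks[] = $callback;
    }

    public function runAfterHooks()
    {
        // Take a copy and clear the list before running anything,
        // so each hook only ever runs once.
        $afterHooks = $this-&gt;afterHooks;
        $this-&gt;afterHooks = [];

        foreach ($afterHooks as $afterHook) {
            $afterHook();
        }
    }
}

// Hypothetical stand-in for a test case; PHPUnit is not involved here.
class FakeTestCase
{
    use AfterHooks;
}

$log = [];
$case = new FakeTestCase();
$case-&gt;after(function () use (&amp;$log) { $log[] = 'first'; });
$case-&gt;after(function () use (&amp;$log) { $log[] = 'second'; });

$case-&gt;runAfterHooks(); // hooks run in the order they were registered
$case-&gt;runAfterHooks(); // nothing left to run, the list was cleared

echo implode(', ', $log), PHP_EOL; // first, second
</code></pre>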

<p>Doesn't have to be global scope, you can use it to tear down anything that's
particularly relevant to a specific test. You are welcome.</p>
]]></content>
        </entry>
            <entry>
            <title type="html"><![CDATA[Faster Tests in PHP: Selectively running tests]]></title>
            <link href="https://davedevelopment.co.uk/2016/11/22/faster-tests-in-php-selectively-running-tests.html"/>
            <updated>2016-11-22T00:00:00+00:00</updated>
            <id>https://davedevelopment.co.uk/2016/11/22/faster-tests-in-php-selectively-running-tests.html</id>
            <content type="html"><![CDATA[<p>I previously talked about <a href="http://davedevelopment.co.uk/2016/11/16/faster-tests-in-php-organising-test-suites.html">organising your test
suites</a>
and one of the great benefits you get from doing so is that you are then able to
run your tests in many different ways and configurations, so that the right
tests get run at the right time.</p>

<h1 id="running-specific-tests-and-test-classes">Running specific tests and test classes</h1>

<p>As I am writing code, I'm usually practicing TDD and writing my tests before
writing and refactoring code. This usually means I have a test class open and at
least the corresponding code class open in splits on the screen. I move back and
forth between the splits, writing test code, production code and refactoring.
If I were to run the whole test suite every time I've finished writing
something, it would make the feedback loop too slow for me on even a medium
sized project.  As I write a test, if the whole set of tests in the class run
very quickly, I might run them all by asking PHPUnit to run 
that test file, allowing me to quickly check the correctness of the test I'm
working on, that it's failing and that the production code I write then makes it
pass.</p>

<pre><code>phpunit tests/integration/App/Controller/AccountControllerTest.php
</code></pre>

<p>However, usually I prefer to run just that single test that I am working on
right now. PHPUnit has a <code>--filter</code> switch, which makes this possible, though it
can be a little fiddly.</p>

<pre><code>phpunit --filter test_it_should_deny_unauthenticated_requests tests/integration/App/Controller/AccountControllerTest.php
</code></pre>

<p>I'll then run the full test class <em>after</em> I've finished refactoring, ensuring that
I haven't affected any related code.</p>

<h2 id="vim-bindings">Vim Bindings</h2>

<p>In order to facilitate this, I use a set of vim key bindings, mostly inspired by
watching Gary Bernhardt's <a href="https://www.destroyallsoftware.com/screencasts">Destroy All
Software</a>.</p>

<p><code>&lt;leader&gt;t</code> If I am in a PHPUnit test file, save and run the file. If I am not in a <em>test
  file</em>, save the file I am in and run the last test file I ran.</p>

<p><code>&lt;leader&gt;s</code> If I am in a PHPUnit test file, save the file and run the <em>test
  case</em> the cursor is in. If I'm not in a test file, save the current file and
  run the last test case I ran.</p>

<p><img src="/img/vim-tdd.gif" alt="vim bindings" /></p>

<p>I love how quick and easy this makes running the tests and because I use
terminal vim, I run them right there using bang(!) and my attention doesn't
leave the current window. Hitting enter dismisses the results.</p>

<p>I also have a few other key bindings set up, though I don't use them all that
often.</p>

<p><code>&lt;leader&gt;Tc</code> Run PHPUnit with code coverage on the current test file, or the
  last test file if looking at production code.</p>

<p><code>&lt;leader&gt;Td</code> Run PHPUnit with TestDox output on the current file or the last
  run test file if looking at production code.</p>

<p><code>&lt;leader&gt;T</code> Run PHPUnit without any arguments, so the default test suite as
  per the <code>phpunit.xml</code> configuration.</p>

<p>There are a whole host of ways to set this kind of thing up with Vim. Some
people like to have Vim send commands to a tmux pane and have the results show
there.</p>

<p>Whatever your favourite editor is, it might be worth seeing what sort of support
it has for running tests.</p>

<h1 id="running-with-filesystem-watchers">Running with filesystem watchers</h1>

<p>It's always my preference to run the tests whenever
I see fit, but if your editor doesn't make it easy, or if it doesn't suit your
workflow, perhaps some of the file watcher setups could be right for you.</p>

<p>These tools watch the filesystem and run PHPUnit for you whenever a file is
updated. I think most people have them running in a window on screen, but I know
some of them show you pass/fail notifications via growl and such.</p>

<p>I think one of
the most well known tools is Ruby's combination of <a href="https://github.com/guard/guard-rspec">guard and
rspec</a>, but Eric Hogue has a post on how
he <a href="http://erichogue.ca/2012/09/php/continuous-testing-in-php-with-guard/">uses guard with his PHPUnit
tests</a>.
There are loads of other tools, a quick search shows there are a bunch of
packages available for grunt and gulp, I'm sure one of them works. If you use
one and really like it, comment below or <a href="https://twitter.com/davedevelopment">tweet at
me</a> and I'll update this post with details.</p>

<h1 id="groups%2Fsuites">Groups/Suites</h1>

<p>Moving on from the inner TDD loop, having your test suite
<a href="http://davedevelopment.co.uk/2016/11/16/faster-tests-in-php-organising-test-suites.html">organised</a>
into directories, suites or groups can be really beneficial for running on
workstations. Sometimes we have tests that are slow to run or simply can't be
run due to hardware, operating system, licensing or networking, so making sure
developers can run a reasonable set of tests before committing or merging code is
essential.  Giving the developers a choice of different runs can make the
feedback loop more pleasant, meaning the tests get run more often, but not so
often that you're wasting developer time and CPU cycles.</p>

<h1 id="continuous-integration">Continuous Integration</h1>

<p>Here is the place to make sure <em>all</em> of the tests get run at <em>some point</em>. My
preference would be for all tests to run before code is released. However, if
you have some tests that are too hard or too slow to run before every release
(assuming you've done everything you can to fix those tests), and the
risk is acceptable, having those tests run in a different CI task is a good idea,
allowing your release cycle to move quickly.</p>

<p>As mentioned in the <a href="http://davedevelopment.co.uk/2016/11/16/faster-tests-in-php-organising-test-suites.html">organising your test
suite</a>
post, being able to run your fastest tests first can help the test runs fail
fast, meaning you get to work fixing the code faster.</p>

<p>Another quick tip, unrelated to testing itself: if you run any code coverage or static analysis on the
code purely for metrics, have that run separately from the main release build,
either automatically or on demand. These things take time and if a drop in code
coverage or an increase in cyclomatic complexity wouldn't stop you from
releasing code, make it part of a separate build/task.</p>

<h1 id="faster-tests-in-php">Faster tests in PHP</h1>

<p>This is the third in a series about keeping your test runs fast. Check back for
more or leave your email address below to get notified when a new post goes up.</p>

<ol>
<li><a href="http://davedevelopment.co.uk/2016/11/08/faster-tests-in-php-avoiding-latency-with-fakes.html">Avoiding latency with Fakes</a></li>
<li><a href="http://davedevelopment.co.uk/2016/11/16/faster-tests-in-php-organising-test-suites.html">Organising Test Suites</a></li>
<li><a href="http://davedevelopment.co.uk/2016/11/23/faster-tests-in-php-selectively-running-tests.html">Selectively Running Tests</a></li>
</ol>
]]></content>
        </entry>
            <entry>
            <title type="html"><![CDATA[Faster Tests in PHP: Organising Test Suites]]></title>
            <link href="https://davedevelopment.co.uk/2016/11/16/faster-tests-in-php-organising-test-suites.html"/>
            <updated>2016-11-16T00:00:00+00:00</updated>
            <id>https://davedevelopment.co.uk/2016/11/16/faster-tests-in-php-organising-test-suites.html</id>
            <content type="html"><![CDATA[<p>One way of keeping your test <em>suites</em> running fast is by organising them in a way that
allows you to run the right tests at the right time. This might be running the
faster, isolated tests to give you instant feedback in your TDD loop, or it
might be running the most critical acceptance tests before you commit a
changeset.</p>

<p>PHPUnit offers a number of ways to organise your test suite, but the
<a href="https://phpunit.de/manual/current/en/organizing-tests.html">docs</a> are a
little light on commentary, so these are my thoughts.</p>

<h1 id="using-the-filesystem">Using the filesystem</h1>

<p>The first and easiest way to get started is to use the filesystem. If you give
PHPUnit a directory as an argument, it will scan that directory (recursively)
for <code>*Test.php</code> files and then execute them. This means we can start to split
our test suite up, just by putting files in different places.</p>
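
<p>To make that discovery rule concrete, here's a rough sketch of a recursive scan
for <code>*Test.php</code> files. It's an illustration only, not PHPUnit's actual
implementation, and <code>findTestFiles</code> is a made-up helper name.</p>

<pre><code class="php">/**
 * Recursively collect the files under $dir whose names end in "Test.php".
 * Illustration only: PHPUnit's real discovery does more than this.
 */
function findTestFiles(string $dir): array
{
    $files = [];

    // Walk every file below $dir, skipping the "." and ".." entries
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($iterator as $file) {
        // "Test.php" is 8 characters long, so compare the last 8
        if ($file-&gt;isFile() &amp;&amp; substr($file-&gt;getFilename(), -8) === 'Test.php') {
            $files[] = $file-&gt;getPathname();
        }
    }

    sort($files);

    return $files;
}
</code></pre>

<p>Calling <code>findTestFiles('tests/unit')</code> would pick up test files at any depth
below that directory and ignore everything else, which is why moving a file is
enough to change which run it belongs to.</p>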

<p>For the main Childcare.co.uk app's test suite, I have two top level PHPUnit
directories.</p>

<p>The first is <code>tests/unit</code> and this is for isolated, fast, unit
tests. These tests tend not to interact with any external systems, quite often
not interacting with other classes, functions or components at all. They take
just a few milliseconds each to run and there are quite a lot of them.</p>

<p>The second directory is called <code>tests/integration</code> and this directory gets
everything else. I try not to get too caught up in naming the types of tests I
write, but depending on your way of thinking, this directory includes unit,
functional, integration, integrated, system and acceptance tests. The tests in here
exercise larger parts of the code and usually interact with third party systems
like databases or HTTP APIs. In a lot of cases, I use <a href="http://davedevelopment.co.uk/2016/11/08/faster-tests-in-php-avoiding-latency-with-fakes.html">fakes to speed up these
tests</a>,
but they're still slow, taking a couple of minutes to run the entire directory.</p>

<p>Under these directories, I tend to follow the file structure of the production
code as closely as possible, though that quite often doesn't apply to the
acceptance tests that I run out of <code>tests/integration</code>, which tend to be
something like <code>tests/integration/src/Childcare/&lt;module&gt;/NameOfFeature.php</code>. If
I were starting green field today, I might have a separate top level directory
for some of those, but I don't lose sleep over it. Either way, having the tests
in a decent hierarchy allows for flexibility in running just the tests for a
section/module/component as needed.</p>

<h1 id="failing-fast">Failing fast</h1>

<p>Given this simple split of faster tests and slower tests, I am able to run the
faster tests first, followed by the slow tests. In theory, this should mean I'm
getting the feedback I need quicker. To facilitate this, I use a simple
<code>Makefile</code> as a task runner.</p>

<pre><code class="Makefile">.PHONY: check

check: 
    vendor/bin/phpunit tests/unit
    vendor/bin/phpunit tests/integration
</code></pre>

<h1 id="using-groups">Using groups</h1>

<p>Even given this separation, I still find I need some further organisation. Some
of those tests in <code>tests/integration</code> are really slow. I have a bunch of code
that interacts with the Facebook API and to make sure my code keeps in tune with
the API responses, I really want to have some tests that actually hit the API.
It turns out that creating test user accounts is quite slow and this slows my
tests down significantly, to the point where I start to get annoyed when I'm
running tests before a merge.  Rather than creating another top level directory
to house the dozen or so tests that create Facebook users, I decided to use
PHPUnit's <code>group</code> feature to allow me to exclude these tests as I please.</p>

<pre><code class="php">/**
 * @test
 * @group facebook
 */
public function some_expensive_test() {}
</code></pre>

<p>Again, using our <code>Makefile</code> as a task runner, I make a recipe that runs the
tests without those facebook tests.</p>

<pre><code class="Makefile">.PHONY: check check-quick

check-quick: 
    vendor/bin/phpunit tests/unit
    vendor/bin/phpunit --stop-on-failure --exclude-group=facebook tests/integration

</code></pre>

<p>I have a few other groups excluded and I use this run regularly throughout the day,
only running the full suite occasionally. Our continuous integration
server always runs the full suite when I push code, so I can carry on developing
carefree with regard to our Facebook integration. Because I use this for rapid
feedback, I also use the <code>--stop-on-failure</code> switch. This can sometimes be
annoying if you've broken code in several places, but more often than not the
first failing test allows me to identify a smaller set of tests to run, make
fixes and run again.</p>

<h1 id="using-%40small%2C-%40medium-and-%40large">Using @small, @medium and @large</h1>

<p>One alternative to separating faster and slower tests into directories is to
use the special <code>@small</code>, <code>@medium</code> and <code>@large</code> annotations. These are aliases
for <code>@group small</code> etc, but also have special meaning if you install the
<code>phpunit/php-invoker</code> package. With this package installed, if you run PHPUnit
with
<a href="https://phpunit.de/manual/current/en/risky-tests.html#risky-tests.test-execution-timeout">--enforce-time-limit</a>,
PHPUnit will mark these tests as risky if they do not execute within a set of
time thresholds.</p>
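
<p>As a sketch of what that looks like in a test class: the class and test names
here are made up, and the base class assumes a PHPUnit version from the same era
as the rest of this post.</p>

<pre><code class="php">class StringHelperTest extends \PHPUnit_Framework_TestCase
{
    /**
     * @test
     * @small
     */
    public function upper_cases_a_string()
    {
        // No I/O at all, so this belongs in the tightest time budget
        $this-&gt;assertEquals("ABC", strtoupper("abc"));
    }

    /**
     * @test
     * @large
     */
    public function creates_a_facebook_test_user()
    {
        // Would talk to a real API, so it gets the most generous limit
        $this-&gt;markTestIncomplete("Illustration only");
    }
}
</code></pre>

<p>Because the annotations double up as groups, <code>vendor/bin/phpunit --group small</code>
should run just the small tests. As far as I'm aware the default limits are 1, 10
and 60 seconds for small, medium and large tests respectively, and they can be
changed in the XML configuration.</p>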

<h1 id="using-test-suites">Using test suites</h1>

<p>Another way to organise the PHPUnit tests is to use <a href="https://phpunit.de/manual/current/en/appendixes.configuration.html#appendixes.configuration.testsuites">test
suites</a>. This isn't
something I currently utilise, but they are quite flexible and can mimic the
behaviour I achieve using the Makefiles.</p>

<pre><code class="xml">&lt;phpunit bootstrap="src/autoload.php"&gt;
  &lt;testsuites&gt;
    &lt;testsuite name="all"&gt;
      &lt;directory&gt;tests/unit&lt;/directory&gt;
      &lt;directory&gt;tests/integration&lt;/directory&gt;
    &lt;/testsuite&gt;
  &lt;/testsuites&gt;
&lt;/phpunit&gt;
</code></pre>

<p>The first thing to note is that when using a test suite, PHPUnit will run the
tests in the order specified. This means with the example above, we get that
failing fast scenario as our faster <code>tests/unit</code> tests get run before the slow
<code>tests/integration</code> tests.</p>

<p>You can also get fairly specific about the files you want to include or even
exclude, so depending on the way you set your files and directories up, you can
achieve something like I did above with the <code>facebook</code> group.</p>

<pre><code class="xml">    &lt;testsuite name="quick"&gt;
      &lt;directory&gt;tests/unit&lt;/directory&gt;
      &lt;directory&gt;tests/integration&lt;/directory&gt;
      &lt;exclude&gt;tests/integration/facebook&lt;/exclude&gt;
    &lt;/testsuite&gt;
</code></pre>

<p>I tend not to use test suites as I find the combination of directories and
groups to be enough.</p>

<p>I plan to follow up on this post with a more detailed look at which tests to run
at what time. Drop your email in the box below to be notified when that one
lands.</p>

<h1 id="faster-tests-in-php">Faster Tests in PHP</h1>

<ol>
<li><a href="http://davedevelopment.co.uk/2016/11/08/faster-tests-in-php-avoiding-latency-with-fakes.html">Avoiding latency with Fakes</a></li>
<li><a href="http://davedevelopment.co.uk/2016/11/16/faster-tests-in-php-organising-test-suites.html">Organising Test Suites</a></li>
</ol>
]]></content>
        </entry>
            <entry>
            <title type="html"><![CDATA[Faster Tests in PHP: Avoiding latency with Fakes]]></title>
            <link href="https://davedevelopment.co.uk/2016/11/08/faster-tests-in-php-avoiding-latency-with-fakes.html"/>
            <updated>2016-11-08T00:00:00+00:00</updated>
            <id>https://davedevelopment.co.uk/2016/11/08/faster-tests-in-php-avoiding-latency-with-fakes.html</id>
            <content type="html"><![CDATA[<p>Faster tests get run more often. Fast tests are critical for people practicing
TDD, keeping that feedback loop nice and tight. One of my favourite ways to keep
tests running on time is to minimise the amount of waiting on I/O needed to exercise the system.</p>

<p>There are a handful of ways to do this. It's quite common for people to reach
for their favourite test double tool, and create mocks and stubs for database
connections or SDKs, but I'm not going to talk about that. I use a lot of
mocks, but I don't use them to keep my tests fast. We can use another type of
test double to try and get our tests running faster: Fakes. A Fake is a simpler
or more lightweight version of the real system or component.</p>

<p>All of the methods I'm going to mention make compromises regarding the
thoroughness of your tests. We're going to be changing the way the system
operates to be different from how it will ultimately operate in production, to
make our tests quicker to run. In doing so, we will be
sacrificing the surety of testing the system end to end as it should be run.</p>

<p>There are three reasons why you might use a Fake to replace a component or system:</p>

<ul>
<li>When the system/component is not available</li>
<li>When it makes your tests easier to write or run</li>
<li>When the system/component is slowing you down</li>
</ul>

<p>We're going to concentrate on the slowing down part. These days we all have
super fast computers and networks at our disposal and most of the I/O we run for basic
web apps is pretty quick, but as your test suite grows, even milliseconds for a
database query or HTTP request that gets run for every test soon adds up.</p>

<h1 id="fake-objects">Fake Objects</h1>

<p>One of the easiest ways to avoid latency is to replace an object that uses disks
or the network with an object that offers the same API, but does things a little
differently, hopefully faster. If you are using any modern PHP framework that
provides testing support, you are probably using a few fakes like these already.
If your framework doesn't provide testing support, maybe you should choose
another framework.</p>

<pre><code class="php">use Aws\S3\S3Client;

class S3Storage implements Storage
{
    private $bucket;
    private $client;

    public function __construct(S3Client $client, $bucket)
    {
        $this-&gt;client = $client;
        $this-&gt;bucket = $bucket;
    }

    public function put($targetPath, $contents)
    {
        $this-&gt;client-&gt;putObject(array(
            'Bucket' =&gt; $this-&gt;bucket,
            'Key'    =&gt; $targetPath,
            'Body'   =&gt; $contents,
        ));
    }

    public function get($targetPath)
    {
        $result = $this-&gt;client-&gt;getObject(array(
            'Bucket' =&gt; $this-&gt;bucket,
            'Key'    =&gt; $targetPath,
        ));

        return (string) $result['Body'];
    }
}
</code></pre>

<p>Assuming we're going to need to use this in our tests, we might go ahead and
create a test bucket on S3 to run all our tests against and for the first few
tests, this seems like it's working great.</p>

<p>As we write a few more tests, we start to realise things are running a little
slowly. Even worse, our internet connection becomes intermittent or drops out
entirely. Sure, we could refactor some code, ditch those integrated tests and
write more isolated unit tests, avoiding the problem, but that's not always
ideal and definitely isn't the only way to approach the problem.</p>

<h1 id="in-memory-fake-objects">In Memory Fake Objects</h1>

<p>Writing a simple implementation of <code>Storage</code> that stores
the data in memory avoids our network problems (as well as a bunch of CPU cycles
in the S3 SDK), getting our tests nice and snappy again.</p>

<pre><code class="php">class ArrayStorage implements Storage
{
    private $data = [];

    public function put($targetPath, $contents)
    {
        $this-&gt;data[$targetPath] = $contents;
    }

    public function get($targetPath)
    {
        return $this-&gt;data[$targetPath];
    }
}
</code></pre>
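
<p>The <code>Storage</code> interface itself isn't shown above. Here's a minimal sketch,
assuming it only needs the two methods used so far, along with a hypothetical
<code>AvatarUploader</code> class to show how the fake gets swapped in:</p>

<pre><code class="php">// Assumed shape of the interface: just the two methods used in this post
interface Storage
{
    public function put($targetPath, $contents);
    public function get($targetPath);
}

// The same in-memory fake as above, repeated so the sketch runs on its own
class ArrayStorage implements Storage
{
    private $data = [];

    public function put($targetPath, $contents)
    {
        $this-&gt;data[$targetPath] = $contents;
    }

    public function get($targetPath)
    {
        return $this-&gt;data[$targetPath];
    }
}

// Hypothetical production class: it only knows about the interface
class AvatarUploader
{
    private $storage;

    public function __construct(Storage $storage)
    {
        $this-&gt;storage = $storage;
    }

    public function upload($userId, $imageData)
    {
        $path = "avatars/{$userId}.png";
        $this-&gt;storage-&gt;put($path, $imageData);

        return $path;
    }
}

// In production this would be an S3Storage, in tests we pass the fake
$storage = new ArrayStorage();
$uploader = new AvatarUploader($storage);
$path = $uploader-&gt;upload(123, "fake image bytes");

var_dump($storage-&gt;get($path) === "fake image bytes"); // bool(true)
</code></pre>

<p>Nothing in <code>AvatarUploader</code> changes between the two setups, which is the whole
point of coding against the interface.</p>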

<h1 id="more-persistent-fakes-objects">More persistent Fake Objects</h1>

<p>One disadvantage to holding data in memory is that it goes away! If we wanted
the data to stick around after a test run (this can be helpful for debugging) or
if we need to have access to the same data across processes
(maybe you're shelling out or hitting a real webserver), we'll need something
with more persistence.</p>

<p>In the previous example, the real system is somewhere on the internet, so a
local disk based implementation will still incur some latency, but will operate much
faster and more reliably than API calls over the internet.</p>

<pre><code class="php">class DiskStorage implements Storage
{
    private $dir;

    public function __construct($dir)
    {
        $this-&gt;dir = $dir;
    }

    /*
     * No error checking for brevity
     */

    public function put($targetPath, $contents)
    {
        file_put_contents($this-&gt;dir."/".$targetPath, $contents);
    }

    public function get($targetPath)
    {
        return file_get_contents($this-&gt;dir."/".$targetPath);
    }
}

</code></pre>

<p>You'll quickly find Fakes like the one above become useful outside of tests. You
might find you use them in your QA environments, or for your local setup for
development or for demonstrations. It's also quite possible that Fakes developed
for testing end up being good enough to ship as production alternatives
to the real systems.</p>

<h1 id="self-initialising-fake-objects">Self Initialising Fake Objects</h1>

<p>Self initialising fakes are kind of like a stub/fake hybrid. They act like a
stub, in that they return canned results, but they're more like a fake because
they actually proxy to a true implementation, caching the calls indefinitely.
Ruby's <a href="https://github.com/vcr/vcr">vcr</a> is a popular library that does this
at the HTTP level, intercepting calls to <code>Net::HTTP</code> and a bunch of other HTTP
clients, replaying the results on subsequent calls. At any given time, you can
choose to forgo the recordings and make the actual underlying HTTP calls. There's a <a href="https://github.com/php-vcr/php-vcr">PHP
port</a> that hooks into curl, but I've yet to
try it out. For the purposes of a demonstration, we'll write our own naive
implementation using the decorator pattern to cache calls to an underlying object.</p>

<pre><code class="php">class VCRStorage implements Storage
{
    private $storage;
    private $libraryDir;

    public function __construct(Storage $storage, $libraryDir)
    {
        $this-&gt;storage = $storage;
        $this-&gt;libraryDir = $libraryDir;
    }

    public function put($targetPath, $contents)
    {
        return $this-&gt;call("put", $targetPath, $contents);
    }

    public function get($targetPath)
    {
        return $this-&gt;call("get", $targetPath);
    }

    private function call($method, ...$args)
    {
        $file = rtrim($this-&gt;libraryDir, "/")."/".md5($method."|".implode("|", $args));

        if (file_exists($file)) {
            return file_get_contents($file);
        }

        $contents = $this-&gt;storage-&gt;{$method}(...$args);
        file_put_contents($file, $contents);

        return $contents;
    }
}
</code></pre>

<p>This could get quite complicated depending on how easily the arguments and
return types serialise to disk, but you get the idea. For something as simple as
the example above, I much prefer this solution to intercepting calls via PHP's
autoloader à la php-vcr.</p>
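
<p>As a rough sketch of one way to cope with arguments and return values that
aren't plain strings, the recording could go through PHP's <code>serialize</code> and
<code>unserialize</code>. The <code>recordedCall</code> function below is a hypothetical helper
rather than part of <code>VCRStorage</code>, and it assumes everything involved can be
serialised:</p>

<pre><code class="php">/**
 * Replay a recorded result if one exists, otherwise make the real call and
 * record whatever it returns. Hypothetical helper, for illustration only.
 */
function recordedCall($libraryDir, $name, callable $real, array $args)
{
    // Serialising the arguments copes with arrays and nested values
    $file = rtrim($libraryDir, "/")."/".md5(serialize([$name, $args]));

    if (file_exists($file)) {
        return unserialize(file_get_contents($file));
    }

    $result = $real(...$args);
    file_put_contents($file, serialize($result));

    return $result;
}

// A throwaway library directory for the demonstration
$dir = sys_get_temp_dir()."/vcr-".uniqid();
mkdir($dir);

// Stand-in for a slow call, counting how often it really gets made
$calls = 0;
$real = function ($id) use (&amp;$calls) {
    $calls++;

    return ['id' =&gt; $id, 'tags' =&gt; ['a', 'b']];
};

$first = recordedCall($dir, "find", $real, [42]);
$second = recordedCall($dir, "find", $real, [42]);

var_dump($first === $second); // bool(true)
var_dump($calls);             // int(1)
</code></pre>

<p>The second call never reaches the underlying function, it's served straight
from the recording on disk.</p>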

<h1 id="verified-fake-objects">Verified Fake Objects</h1>

<p>Once our Fakes get a little more complicated, we might want to start writing
tests for them to make sure they're behaving like the real thing, particularly
if the system/component is likely to change regularly.</p>

<pre><code class="php">abstract class StorageTest extends \PHPUnit_Framework_TestCase
{
    abstract protected function getStorage(): Storage;

    /**
     * @test
     */
    public function what_goes_in_must_come_out()
    {
        $targetPath = "/some/path";
        $contents = "the contents";
        $storage = $this-&gt;getStorage();

        $storage-&gt;put($targetPath, $contents);

        $this-&gt;assertEquals($contents, $storage-&gt;get($targetPath));
    }
}
</code></pre>

<p>Subclassing this class and providing <code>S3Storage</code>, <code>ArrayStorage</code>, <code>DiskStorage</code>
and <code>VCRStorage</code> instances in the <code>getStorage</code> method enables us to run the same
tests against the different implementations. Adam Wathan has a nice
<a href="https://adamwathan.me/2016/02/01/preventing-api-drift-with-contract-tests/">screencast</a>
on this if you fancy watching how easy he makes it look in just 10 minutes.</p>
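
<p>For example, the subclass for the in-memory fake might look something like
this. It's a sketch that assumes <code>StorageTest</code>, <code>Storage</code> and <code>ArrayStorage</code>
from above are all autoloadable:</p>

<pre><code class="php">class ArrayStorageTest extends StorageTest
{
    // The only thing each subclass has to do is hand over an instance,
    // the inherited test methods do the rest
    protected function getStorage(): Storage
    {
        return new ArrayStorage();
    }
}
</code></pre>

<p>PHPUnit skips the abstract <code>StorageTest</code> class itself and runs the inherited
<code>what_goes_in_must_come_out</code> test once for each concrete subclass.</p>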

<h1 id="fake-systems-with-in-memory-backends">Fake Systems with In Memory Backends</h1>

<p>Sometimes, if something is particularly cross cutting and difficult to isolate,
it can be hard to replace the client code or object with a fake. In this instance, it is
sometimes possible to replace or change the whole system in the backend to speed up our tests.</p>

<p>If you need to shift large amounts of data in your databases, it might be worth
keeping the client code the same, but switching out to a memory based backend.</p>

<p>MySQL comes with a <a href="http://dev.mysql.com/doc/refman/5.7/en/memory-storage-engine.html">memory storage
engine</a>, but
I prefer to create a ramdisk and configure MySQL to keep its data there.</p>

<pre><code>davem@wes:~$ mount  | grep ramdisk
tmpfs on /tmp/ramdisk type tmpfs (rw,nosuid,nodev,relatime,size=1048576k)
</code></pre>

<h1 id="completely-fake-systems">Completely Fake systems</h1>

<p>Another quick win can be to replicate the whole third party system with a local
equivalent.</p>

<p>Can you run your tests against <a href="https://sqlite.org/">SQLite</a> rather than your
full blown RDBMS? SQLite can be given a URL to tell it to store everything in
memory.</p>

<pre><code class="php">$conn = \Doctrine\DBAL\DriverManager::getConnection([
    'url' =&gt; 'sqlite:///:memory:',
] , new \Doctrine\DBAL\Configuration());
</code></pre>
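
<p>If you're not using Doctrine, plain PDO can do the same thing with the
<code>sqlite::memory:</code> DSN. A minimal sketch, assuming the <code>pdo_sqlite</code> extension
is available and using a made-up <code>users</code> table:</p>

<pre><code class="php">// The database lives in RAM and disappears when the connection closes
$pdo = new PDO('sqlite::memory:');
$pdo-&gt;setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical schema, just enough to show a round trip
$pdo-&gt;exec('CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)');

$insert = $pdo-&gt;prepare('INSERT INTO users (email) VALUES (:email)');
$insert-&gt;execute([':email' =&gt; 'user@example.org']);

$count = (int) $pdo-&gt;query('SELECT COUNT(*) FROM users')-&gt;fetchColumn();

var_dump($count); // int(1)
</code></pre>

<p>Each new connection gets its own empty database, so the schema has to be
created as part of the test setup.</p>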

<p>Amazon provides a <a href="http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DynamoDBLocal.html">downloadable
version</a>
of <a href="https://aws.amazon.com/dynamodb/">DynamoDB</a>, which you can run locally for
your tests, avoiding the latency of making calls across the internet. There are also
a bunch of compatible implementations of other AWS services to be found on
GitHub, though your mileage may vary.</p>

<h1 id="wrapping-up">Wrapping up</h1>

<p>If something is slowing your test runs down, make it quicker. If you can't
make it quicker, replace it.</p>

<p>This is the first in a series of posts describing how you can go about making test
runs faster. I'll be back to update this post as more posts in the series get
published. If you'd like to be notified, pop your email address in the box
below.</p>

<h1 id="faster-tests-in-php">Faster tests in PHP</h1>

<ol>
<li><a href="http://davedevelopment.co.uk/2016/11/08/faster-tests-in-php-avoiding-latency-with-fakes.html">Avoiding latency with Fakes</a></li>
<li><a href="http://davedevelopment.co.uk/2016/11/16/faster-tests-in-php-organising-test-suites.html">Organising Test Suites</a></li>
</ol>
]]></content>
        </entry>
            <entry>
            <title type="html"><![CDATA[Service Locators have their place]]></title>
            <link href="https://davedevelopment.co.uk/2016/06/01/service-locators-have-their-place.html"/>
            <updated>2016-06-01T00:00:00+00:00</updated>
            <id>https://davedevelopment.co.uk/2016/06/01/service-locators-have-their-place.html</id>
            <content type="html"><![CDATA[<p>I was prompted to write this post after seeing a couple of things pop up in my
timeline.</p>

<div class="twitter-widget">
    <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><p
    lang="en" dir="ltr">Service locators don’t belong in controllers - <a
    href="https://t.co/XrtxBTlbtp">https://t.co/XrtxBTlbtp</a></p>&mdash; Brandon
    Savage (@brandonsavage) <a
    href="https://twitter.com/brandonsavage/status/738083153009594373">June 1,
    2016</a></blockquote>
    <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>
</div>

<p>Brandon's article carries an overarching message and he states it in his rule of
thumb:</p>

<blockquote>
  <p>Service locators don’t belong inside controllers. Period.</p>
</blockquote>

<p>I couldn't disagree more. Controllers are the number one place I use a service
locator and it seems I'm not the only one, as hinted at by <a
href="https://twitter.com/everzet">Konstantin</a>, also tweeting today:</p>

<div class="twitter-widget">
    <blockquote class="twitter-tweet" data-conversation="none" data-lang="en"><p
    lang="en" dir="ltr">I spent way too much time in the past making controllers
    &quot;clean&quot;. Nowadays they&#39;d re dirty as sin...</p>&mdash; Konstantin
    K (@everzet) <a href="https://twitter.com/everzet/status/737691352746496000">May
    31, 2016</a></blockquote>
    <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>
</div>

<p>I used to think like Brandon. I used to try and <a href="/2012/10/03/Silex-Controllers-As-Services.html">keep my controllers
clean</a> too, but like Konstantin,
at some point a few years ago I realised I didn't need to anymore. I was getting
better at putting clean code in the proper places and as Konstantin puts it, I could
make my controllers as dirty as sin.</p>

<div class="twitter-widget">
    <blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">I&#39;ve
    come full circle on controllers. I was isolating them from the framework. Now I
    don&#39;t care, couple them as tightly as I like.</p>&mdash; Dave Marshall
    (@davedevelopment) <a
    href="https://twitter.com/davedevelopment/status/461449597077164032">April 30,
    2014</a></blockquote>
    <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>
</div>

<p>In a typical MVC framework, controllers are for converting an HTTP request into
something suitable to send to your actual application code. The better you get
at extracting business logic or even just complicated HTTP layer logic to places
outside your controllers, the thinner and dumber your controllers get.  Once
they get dumb and thin, it makes sense to leverage the conveniences that a
decent MVC framework provides for you.</p>

<p>If I'm making changes to a controller and it starts to get painful, there's a
good chance I would look to extract some logic out of the controller,
rather than extracting the conveniences afforded me by the framework. This
usually manifests in some sort of Service Layer for things coming in, and
as Presenters for things going out.</p>

<p>Use and abuse your chosen framework, that's what it's there for.</p>
]]></content>
        </entry>
            <entry>
            <title type="html"><![CDATA[Verifying Doubles in PHP]]></title>
            <link href="https://davedevelopment.co.uk/2016/04/20/verifying-doubles-in-php.html"/>
            <updated>2016-04-20T00:00:00+00:00</updated>
            <id>https://davedevelopment.co.uk/2016/04/20/verifying-doubles-in-php.html</id>
            <content type="html"><![CDATA[<p>A common concern that gets raised about using test doubles (mocks, stubs, spies
etc.) is that the configuration of the test double's stubs or expectations
can get out of sync with the signatures of the actual type. It's probably best
explained with an example.</p>

<pre><code class="php">interface UserRepository { }

class Foo 
{
    /* ... */

    function bar()
    {
        $this-&gt;userRepository-&gt;delete(123);
    }
}

/** @test */
function should_delete_user_123()
{
    $userRepository = Mockery::mock(UserRepository::class); 
    $userRepository-&gt;shouldReceive("delete")-&gt;with(123)-&gt;once();

    $foo = new Foo($userRepository);

    $foo-&gt;bar();
}

</code></pre>

<p>Despite the <code>delete</code> method not existing on the <code>UserRepository</code> interface, this
test will pass. For <a href="https://github.com/padraic/mockery">Mockery</a>, this is by
design. When I'm in my TDD loop, I'm designing the <code>UserRepository</code> interface
as I develop the <code>Foo</code> service, programming by wishful thinking. In order for
<code>Foo</code> to do <code>bar</code>, I'd like to assume I have a <code>UserRepository::delete</code>
method, but I'll care about adding that later. The problem manifests when we
don't necessarily remember to deal with it later. We should notice the problem
when we run some higher level test, but they don't always exist and even if they
do, we might make the mistake of adding the <code>delete</code> method to the concrete
<code>UserRepository</code> used in that higher level test, rather than the abstract. All
the tests will pass, but things still won't be quite right.</p>

<p>The Ruby community came up with a solution to this,
<a href="https://github.com/xaviershay/rspec-fire">rspec-fire</a>, which was subsequently
made obsolete in favour of
<a href="https://relishapp.com/rspec/rspec-mocks/v/3-0/docs/verifying-doubles">verifying-doubles</a> in RSpec core. This works by allowing you to program by wishful
thinking, until you actually create the class, at which point RSpec will check
to make sure the method you are stubbing or expecting actually exists. RSpec
will also check the arity of the stubs and expectations against the real thing.</p>

<p>This sounds great, but kind of annoys me. Just because a class or type exists,
doesn't mean its API is finalised. I would prefer to continue programming by
wishful thinking within my TDD loop.</p>

<p>So how are we going to deal with this problem in PHP? For methods that don't
exist at all, <a href="https://github.com/phpspec/prophecy">Prophecy</a> will throw
exceptions if you try to set up stubs or expectations.</p>

<pre><code>There was 1 error:

1) ProphecyTest::prophecy_test
Prophecy\Exception\Doubler\MethodNotFoundException: Method `Double\UserRepository\P1::update()` is not defined
</code></pre>

<p>I don't believe PHPUnit mocks has any such feature at present, but it looks like
<a href="https://github.com/sebastianbergmann/phpunit-mock-objects/commit/def4c9df058eaac669a762c000daee0b66e3b163">something is in the
works</a>.</p>

<p><a href="https://github.com/padraic/mockery">Mockery</a> comes with a global configuration
option to prevent stubbing and expecting methods that don't exist yet, that is
disabled by default. You can turn it on in your test bootstrap. I like to turn
it on for test runs outside the TDD loop. These test runs, like the ones on a CI
server, are looking for regressions rather than helping me develop behaviour in the
system, so it seems sensible to verify we aren't doing anything stupid. I get to
ignore warnings during my TDD loop, but get the safety net of having them verified
at a later date.</p>

<pre><code>Mockery::getConfiguration()-&gt;allowMockingNonExistentMethods(false);
</code></pre>
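
<p>One way to wire that up is to key the setting off an environment variable in
the test bootstrap. This is only a sketch: the <code>VERIFY_DOUBLES</code> variable name
is made up, and the autoload path assumes a standard Composer layout.</p>

<pre><code class="php">// tests/bootstrap.php

require __DIR__.'/../vendor/autoload.php';

// Set VERIFY_DOUBLES=1 on the CI server (hypothetical variable name) and
// leave it unset locally to keep programming by wishful thinking
if (getenv('VERIFY_DOUBLES') === '1') {
    \Mockery::getConfiguration()-&gt;allowMockingNonExistentMethods(false);
}
</code></pre>

<p>A regression run then becomes something like <code>VERIFY_DOUBLES=1 vendor/bin/phpunit</code>,
while the plain <code>vendor/bin/phpunit</code> in the TDD loop stays permissive.</p>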

<p>As for the arity of existing methods, I think that's a problem best solved with
the proper use of PHP's type hints.</p>
]]></content>
        </entry>
            <entry>
            <title type="html"><![CDATA[Effective tests: Creating test data with fixture factories]]></title>
            <link href="https://davedevelopment.co.uk/2015/11/11/creating-test-data-with-fixture-factories.html"/>
            <updated>2015-11-11T00:00:00+00:00</updated>
            <id>https://davedevelopment.co.uk/2015/11/11/creating-test-data-with-fixture-factories.html</id>
            <content type="html"><![CDATA[<p>Following from my post on <a href="/2015/10/26/setting-up-a-database-fixture.html">setting up a database
fixture</a> for your test suite,
the next step is adding data to that fixture for your specific tests: the
more specific Arrange part of the <a href="http://c2.com/cgi/wiki?ArrangeActAssert">Arrange, Act,
Assert</a> pattern.</p>

<p>For a long time, I thought the only way to have database records for my tests
was to manage one large SQL dump that contained lots of records, all of which
were required for one or more tests within the test suite, or to use
<a href="http://dbunit.sourceforge.net/">DBUnit</a> with a bunch of XML files. This changed
when I came across <a href="https://github.com/thoughtbot/factory_girl">factory_girl</a> in
some Ruby test suites.</p>

<p>There are a bunch of similar packages for PHP out there
(<a href="https://github.com/thephpleague/factory-muffin">factory-muffin</a> springs to
mind), but I've always tended to roll my own, mostly as I've worked in gnarly
legacy code bases and to be honest, it's not really the most complicated thing
to do. I also keep them as simple as possible and avoid the production
code, so data gets inserted directly into the database, rather than using an
ORM to store the data. I've previously written about <a href="/2015/01/21/object-mothers.html">Object
Mothers</a> and <a href="/2015/01/28/test-data-builders.html">Test Data
Builders</a>, which I would use alongside the
ORM if I wanted to do things that way.</p>

<p>If you're generating a lot of fake data, you might want
to look into using something like <a href="https://github.com/fzaninotto/Faker">Faker</a>,
but I've found a handful of simple functions cater for most of my needs.</p>

<p>As a starting point, given a <code>users</code> table with an <code>email</code> and <code>password</code> field,
I'll add the schema to the <code>fixture.sql</code> file as mentioned in the <a href="/2015/10/26/setting-up-a-database-fixture.html">setting up a
database fixture</a> article, then
create a class like this:</p>

<pre><code class="php">&lt;?php

namespace tests\support;

use Doctrine\DBAL\Connection;

class UserFixtureFactory
{
    private $conn;

    public function __construct(Connection $conn)
    {
        $this-&gt;conn = $conn;
    }

    public function create()
    {
        $data = [
            'email' =&gt; "user@example.org",
            "password" =&gt; password_hash("password", PASSWORD_DEFAULT),
        ];

        $this-&gt;conn-&gt;insert('users', $data);
    }
}
</code></pre>

<p>This seems simple enough and is easy enough to get working in one of our tests.</p>

<pre><code class="php">    /** @test */
    public function fixture_factory_works()
    {
        $userFixtureFactory = new UserFixtureFactory($this-&gt;conn());
        $userFixtureFactory-&gt;create();

        $this-&gt;assertEquals(1, $this-&gt;conn()-&gt;fetchColumn("SELECT COUNT(*) FROM users"));
    }
</code></pre>

<p>Like any good <code>users</code> table, our <code>email</code> field has a unique constraint on it, so
we need to work around that:</p>

<pre><code class="php">    /** @test */
    public function fixture_factory_works_with_lots_of_users()
    {
        $userFixtureFactory = new UserFixtureFactory($this-&gt;conn());

        for ($i = 0; $i &lt; 10; $i++) {
            $userFixtureFactory-&gt;create();
        }

        $this-&gt;assertEquals(10, $this-&gt;conn()-&gt;fetchColumn("SELECT COUNT(*) FROM users"));
    }
</code></pre>

<p>Adding a simple counter to the method will keep our email addresses unique:</p>

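<pre><code class="php">    public function create()
    {
        // A static counter persists across calls, so each email is unique
        static $counter = 0;
        $counter++;

        $data = [
            'email' =&gt; "user{$counter}@example.org",
            "password" =&gt; password_hash("password", PASSWORD_DEFAULT),
        ];

        $this-&gt;conn-&gt;insert('users', $data);
    }
</code></pre>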

<p>Yay green test! It's kinda slow though, slower than I expected. It's probably the
password hashing, so let's swap that call for a precomputed literal.</p>

<pre><code class="php">        $data = [
            'email' =&gt; "user{$counter}@example.org",
            "password" =&gt; '$2y$10$Fx9LBid2/HV24SseoTp/sulorRnkykwN7D8HbUvsIgPtrDsxBqnUq', # password_hash("password", PASSWORD_DEFAULT),
        ];
</code></pre>

<p>The next thing I want is to allow the caller to override the default data:</p>

<pre><code class="php">    /** @test */
    public function fixture_factory_allows_overriding_defaults()
    {
        $userFixtureFactory = new UserFixtureFactory($this-&gt;conn());

        $userFixtureFactory-&gt;create(['email' =&gt; 'dave@example.org']);

        $this-&gt;assertEquals('dave@example.org', $this-&gt;conn()-&gt;fetchColumn("SELECT email FROM users"));
    }

    public function create(array $data = [])
    {
        static $counter = 0;
        $counter++;

        $data = array_merge([
            'email' =&gt; "user{$counter}@example.org",
            "password" =&gt; '$2y$10$Fx9LBid2/HV24SseoTp/sulorRnkykwN7D8HbUvsIgPtrDsxBqnUq', # password_hash("password", PASSWORD_DEFAULT),
        ], $data);

        $this-&gt;conn-&gt;insert('users', $data);
    }
</code></pre>

<p>Finally, I want the factory to return the data it used, so that the test
code can make use of it as necessary:</p>

<pre><code class="php">    /** @test */
    public function fixture_factory_returns_data()
    {
        $userFixtureFactory = new UserFixtureFactory($this-&gt;conn());

        $id = $userFixtureFactory-&gt;create()['id'];

        $this-&gt;assertEquals($id, $this-&gt;conn()-&gt;fetchColumn("SELECT id FROM users"));
    }

    public function create(array $data = [])
    {
        static $counter = 0;
        $counter++;

        $data = array_merge([
            'email' =&gt; "user{$counter}@example.org",
            "password" =&gt; '$2y$10$Fx9LBid2/HV24SseoTp/sulorRnkykwN7D8HbUvsIgPtrDsxBqnUq', # password_hash("password", PASSWORD_DEFAULT),
        ], $data);

        $this-&gt;conn-&gt;insert('users', $data);

        $data['id'] = $this-&gt;conn-&gt;lastInsertId();

        return $data;
    }
</code></pre>

<p>That's pretty much it.</p>

<p>All the test examples so far have created the fixture factory when required. I
don't recommend doing this; I would instead create a helper method in a trait
or on a base class.</p>

<pre><code class="php">    /** @test */
    public function fixture_factory_works()
    {
        $this-&gt;hasAUser();

        $this-&gt;assertEquals(1, $this-&gt;conn()-&gt;fetchColumn("SELECT COUNT(*) FROM users"));
    }

    public function hasAUser(array $data = [])
    {
        $userFixtureFactory = new UserFixtureFactory($this-&gt;conn());

        return $userFixtureFactory-&gt;create($data);
    }
</code></pre>

<p>Things tend to get more complicated than this, particularly when your factories
need to be aware of other factories in order to create and maintain
relationships. I'll cover how I tackle that in another article, but suffice to
say, it's not much different from managing dependencies in your production
code.</p>

<p>Happy testing!</p>
]]></content>
        </entry>
            <entry>
            <title type="html"><![CDATA[Effective tests: Setting up a database fixture]]></title>
            <link href="https://davedevelopment.co.uk/2015/10/26/setting-up-a-database-fixture.html"/>
            <updated>2015-10-26T00:00:00+00:00</updated>
            <id>https://davedevelopment.co.uk/2015/10/26/setting-up-a-database-fixture.html</id>
            <content type="html"><![CDATA[<p>For most of us in the PHP community, writing our first integrated test usually
means interacting with a database. For too long I considered this a difficult
and frustrating thing to do, so I avoided it, leaving code either uncovered, or
covered with overly specified tests using way too many test doubles. Most of the
modern frameworks do this kind of thing for you, but here's how I do it.</p>

<p>It's worth noting that I tend to work on products, rather than distributed
software. My software doesn't get distributed, so I don't have to support multiple
database vendors etc.</p>

<h1 id="the-stages-of-the-test">The stages of the test</h1>

<p>When we think about how a database is involved in a test, we can think of the
usual Arrange, Act, Assert steps, but with set up and tear down either side of
them.</p>

<ol>
<li>Set the database up to a known state for all tests</li>
<li>Arrange the database records for this specific test</li>
<li>Act, exercising the system under test</li>
<li>Assert against the state of the database</li>
<li>Tear the database back down to its known state for the next test</li>
</ol>
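
<p>The five stages above can be sketched end to end in a few lines. This is
only an illustration under stated assumptions: an in-memory SQLite database
accessed through PDO stands in for the real test database server, and a plain
<code>UPDATE</code> query stands in for the system under test.</p>

<pre><code class="php">&lt;?php

// Assumption: in-memory SQLite via PDO stands in for the real test database
// server so the sketch is self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo-&gt;setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// 1. Set up: load the schema so every test starts from a known state
$pdo-&gt;exec('CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE, active INTEGER)');

// 2. Arrange: insert the records this specific test needs
$pdo-&gt;exec("INSERT INTO users (email, active) VALUES ('dave@example.org', 1)");

// 3. Act: exercise the system under test (a plain query stands in for it)
$pdo-&gt;exec("UPDATE users SET active = 0 WHERE email = 'dave@example.org'");

// 4. Assert against the state of the database
$active = $pdo-&gt;query("SELECT active FROM users WHERE email = 'dave@example.org'")-&gt;fetchColumn();
if ($active != 0) {
    throw new RuntimeException('Expected the user to be deactivated');
}

// 5. Tear down: return the database to its known (empty) state
$pdo-&gt;exec('DELETE FROM users');
$remaining = (int) $pdo-&gt;query('SELECT COUNT(*) FROM users')-&gt;fetchColumn();
</code></pre>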

<p>This article is going to focus purely on the first and last stages: setting the
database up and tearing it down. I have a test support class that manages this
for me. It takes a Doctrine DBAL connection, which is already configured with
access to a test database server.</p>

<pre><code class="php">&lt;?php

namespace tests\support\Database;

use Doctrine\DBAL\Connection;

class Fixture
{
    private $conn;

    public function __construct(Connection $conn)
    {
        $this-&gt;conn = $conn;
    }

    public function setup() { }
}
</code></pre>

<h1 id="set-up">Set up</h1>

<p>Given we have a database server available to us, the set up stage is going to
include getting our schema loaded. While it's tempting to run your database
migrations to set the schema up, I feel like this is a waste of time. I rarely
write database migrations and once they've hit all development, staging and
production environments, they're not really useful to me any more and don't need
to be tested. I like to keep a copy of the current schema alongside the tests in
version control, <code>tests/support/Database/fixture.sql</code>. I then import this fixture during
the set up stage.</p>

<pre><code class="php">    public function setup()
    {
        $this-&gt;load(); 
    }

    private function load($file = null)
    {
        // Default to the fixture that lives alongside this class
        $file = $file ?: __DIR__ . '/fixture.sql';
        $params = $this-&gt;conn-&gt;getParams();
        system("MYSQL_PWD={$params['password']} mysql -h{$params['host']} -P {$params['port']} -u{$params['user']} {$params['dbname']} &lt; $file");
    }
</code></pre>

<p>I shell out to the mysql command line client as I've found it to be a shade
faster, but your mileage may vary and you could try going through the DBAL
instance as well.</p>
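
<p>As a rough idea of what loading the fixture through a connection might look
like, here's a sketch that reads the file and executes it in one call. It makes
some assumptions: a plain PDO connection to in-memory SQLite stands in for the
DBAL connection and MySQL server, and a temporary file stands in for
<code>fixture.sql</code>. Whether a driver accepts several statements in a single
<code>exec()</code> call depends on the driver and its configuration, so treat
this as a starting point to experiment with.</p>

<pre><code class="php">&lt;?php

// Assumption: in-memory SQLite via PDO and a throwaway fixture file stand in
// for the real connection and tests/support/Database/fixture.sql.
$pdo = new PDO('sqlite::memory:');
$pdo-&gt;setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$file = tempnam(sys_get_temp_dir(), 'fixture');
file_put_contents($file, "
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE, password TEXT);
    CREATE TABLE roles (id INTEGER PRIMARY KEY, name TEXT);
");

// Load the whole fixture in one call rather than shelling out
$pdo-&gt;exec(file_get_contents($file));
unlink($file);

// List the tables the fixture created (this query is SQLite specific)
$tables = $pdo-&gt;query("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    -&gt;fetchAll(PDO::FETCH_COLUMN);
print_r($tables);
</code></pre>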

<p>In order to keep the fixture up to date when creating migrations, I have a short
script that will load the fixture, run the migrations and then dump the database
back into that fixture file, ready to be committed to version control alongside
the migration.</p>

<pre><code class="php">    public function update()
    {
        $this-&gt;load();

        # It's not quite like this, but you get the idea
        system("APPLICATION_ENV=testing php artisan migrate");

        $this-&gt;dump();
    }

    public function dump($file = null)
    {
        // Default to the fixture that lives alongside this class
        $file = $file ?: __DIR__ . '/fixture.sql';
        $params = $this-&gt;conn-&gt;getParams();

        system("MYSQL_PWD={$params['password']} mysqldump --set-gtid-purged=OFF -h{$params['host']} -P {$params['port']} -u{$params['user']} --opt {$params['dbname']} &gt; $file");
    }
</code></pre>

<p>If I have the need for some standard data to be available to every test, I don't
bother writing code to seed the database at the start of every test. I load the
fixture, do whatever it takes to get the data in there, be it scripting or by
hand, and then dump the fixture again, ready to be stored in version control.
This keeps my start up time quick and deterministic: what's in the <code>fixture.sql</code>
file is what every test starts with.</p>

<h1 id="tear-down">Tear down</h1>

<p>You could tear the database down by deleting the entire fixture, allowing the
setup method to reload the entire fixture again, but this can be very slow. To
speed things up, we can load the fixture once at the start of the test suite, and
then have every test tear the database back down to this initial state.</p>

<pre><code class="php">    public function setup()
    {
        if ($this-&gt;fixtureLoaded) {
            return;
        }

        $this-&gt;load(); 
        $this-&gt;fixtureLoaded = true;
    }
</code></pre>

<p>A popular way of tearing a database down is to run each test in a transaction
and then roll back the transaction after the test has completed. This works quite
well and is really fast, but has a couple of drawbacks.</p>
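
<p>To make the rollback approach concrete before getting to those drawbacks,
here's a minimal sketch. It assumes an in-memory SQLite database accessed
through PDO in place of the real test database; the table and email address are
made up for illustration.</p>

<pre><code class="php">&lt;?php

// Assumption: in-memory SQLite via PDO stands in for the test database.
$pdo = new PDO('sqlite::memory:');
$pdo-&gt;setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo-&gt;exec('CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)');

// Set up: open a transaction before the test runs
$pdo-&gt;beginTransaction();

// The test writes through the same connection
$pdo-&gt;exec("INSERT INTO users (email) VALUES ('dave@example.org')");
$during = (int) $pdo-&gt;query('SELECT COUNT(*) FROM users')-&gt;fetchColumn();

// Tear down: roll back, discarding everything the test wrote
$pdo-&gt;rollBack();
$after = (int) $pdo-&gt;query('SELECT COUNT(*) FROM users')-&gt;fetchColumn();

echo "during: {$during}, after: {$after}", PHP_EOL; // during: 1, after: 0
</code></pre>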

<p>The first is that all database operations need to run through the
same connection that holds the transaction. This isn't so easy when you
want to do headless browser tests with tools like Selenium, or if you need to
do any out of band processing like queue workers. To get around this, you can use a
truncation strategy. The tear down stage truncates all the necessary tables to
return the database to a known state (except for auto increment counts, but I can
live with that).</p>

<pre><code class="php">    public function tearDown()
    {
        $this-&gt;truncate();
    }

    public function truncate()
    {
        foreach ($this-&gt;tablesToTruncate as $table) {
            $this-&gt;conn-&gt;delete($table, array(1 =&gt; 1));
        }
    }
</code></pre>

<p>Again, here I use the equivalent of <code>DELETE FROM $table WHERE 1=1</code>, which I've
found to be a hair quicker than <code>TRUNCATE $table</code>, but you should benchmark for
yourself. I manually keep a list of tables to truncate, but you could easily
make a list of tables not to truncate, or if you don't have any data in the
<code>fixture.sql</code> file, truncate all tables.</p>
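
<p>The "list of tables not to truncate" variation could look something like the
sketch below. It's an illustration only: in-memory SQLite via PDO stands in for
the test database, the table names and the <code>$tablesToKeep</code> variable
are hypothetical, and the query used to list tables is SQLite specific, so
you'd need your own database's equivalent.</p>

<pre><code class="php">&lt;?php

// Assumption: in-memory SQLite via PDO stands in for the test database, and
// the table names are made up for illustration.
$pdo = new PDO('sqlite::memory:');
$pdo-&gt;setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo-&gt;exec('CREATE TABLE countries (code TEXT PRIMARY KEY)');
$pdo-&gt;exec('CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)');
$pdo-&gt;exec("INSERT INTO countries (code) VALUES ('GB')");
$pdo-&gt;exec("INSERT INTO users (email) VALUES ('dave@example.org')");

// Tables holding standard data from the fixture, which must survive
$tablesToKeep = ['countries'];

// Ask the database for its tables (this query is SQLite specific)
$allTables = $pdo-&gt;query("SELECT name FROM sqlite_master WHERE type = 'table'")
    -&gt;fetchAll(PDO::FETCH_COLUMN);

foreach (array_diff($allTables, $tablesToKeep) as $table) {
    // Table names come from the database itself, not from user input
    $pdo-&gt;exec("DELETE FROM {$table}");
}

$users = (int) $pdo-&gt;query('SELECT COUNT(*) FROM users')-&gt;fetchColumn();
$countries = (int) $pdo-&gt;query('SELECT COUNT(*) FROM countries')-&gt;fetchColumn();

echo "users: {$users}, countries: {$countries}", PHP_EOL; // users: 0, countries: 1
</code></pre>

<p>The trade-off is that a new table gets cleaned automatically, but you have to
remember to add any new table of standard data to the keep list.</p>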

<p>The second drawback to using transactions, and the most important for me, is that
the state of the database is always torn back down at the end of the test. If
you're trying to debug a particular problem, it can be very useful to be able to
examine the database after the test has finished. Surely this is the same using
a truncation strategy, I hear you say. You would be right, but with truncation,
we can move the truncation to the setup method, not do anything on tear down and
don't call me Shirley.</p>

<pre><code class="php">    public function setup()
    {
        if ($this-&gt;fixtureLoaded) {
            $this-&gt;truncate();

            return;
        }

        $this-&gt;load(); 
        $this-&gt;fixtureLoaded = true;
    }

    public function tearDown() {}
</code></pre>

<p>This way, each test starts from a clean state, but after a test has run, you get the
chance to inspect the state of the database. It's pretty fast too, at least fast
enough for me.</p>

<h1 id="usage">Usage</h1>

<p>I use this code in a setup method of a base class in my day job, but you can also do the same
thing with a trait. This example uses a singleton to get hold of a <code>Fixture</code>
instance, but I'll write about other ways of doing that in another article.</p>

<pre><code class="php">&lt;?php

namespace tests\support;

use tests\support\Database\Fixture;

trait UsesDatabase
{
    /**
     * @before
     */
    public function setupDatabase()
    {
        Fixture::instance()-&gt;setup();
    }
}
</code></pre>

<pre><code class="php">&lt;?php

namespace tests\integration;

use tests\support\UsesDatabase;

class SomeTest extends \PHPUnit_Framework_TestCase
{
    use UsesDatabase;

    // ...
}
</code></pre>

<p>Happy testing!</p>
]]></content>
        </entry>
    </feed>