A Small Exercise in Refactoring Data Access Objects

On July 3, 2011, christopher wrote:

Dealing with “legacy” code is one of the problems I face most often on a day-to-day basis. Whilst it would be lovely to always be in a position to write clean, fresh code, it is far more common to be in the position of having to adapt and modify pre-existing code that is far from ideal (which is often why it needs to be modified). Such code will commonly have no tests, or any tests which do exist fail to test the code in any meaningful sense. Modifying such code can be fraught with peril: in the absence of good tests, the changes one makes can have more or less subtle repercussions that ripple across a system, frequently with adverse, unanticipated functional side-effects. Moreover, in order to implement new or modified functionality within such code using test-first methodologies, it is commonly necessary first to change the code in order to make it “testable”. Hence, at the unit-test level, we have a kind of “Catch-22” situation: I want to test the code in order to change it, but I first need to change it in order to test it. How do I ensure that my changes for the purposes of testability do not have a functional impact when those changes cannot themselves be tested?

My answer is usually to “go up a level” and isolate the legacy code within an integration test harness. For the data access objects with which I am concerned here, this means actually going to the database. The rationale for this is simple: an integration test should prove the contracts of a class or method under runtime conditions. Integration tests may sometimes be slow, but the confidence they provide to change the underlying code is essential in this case. Moreover, they serve to expose and document contracts which heretofore had probably never even been given consideration, and this is a necessary step.

To begin with, one should be clear about the definitions of “legacy” and “refactoring” here:

Legacy
In line with many authors on TDD, such as Steve Freeman, I would initially define “legacy” simply as “code without tests” (or code with meaningless tests, which is the same thing). Additionally, I would add the architectural definition that “legacy” code is that which utilises patterns, supporting technologies etc. that are explicitly “deprecated” within the overall architectural plan for a system (you do have such a plan, don’t you?) — it has been identified and there is a strategy in place for its removal.
Refactoring
Consequently, refactoring can be defined as changing the underlying implementation of code that already passes good tests such that it is brought more fully into line with your overall architectural plan. If you change code without a test (I’m not so dogmatic as to say you should never do this, it is just that you need a very good reason), or before it is passing the test (a mistake when dealing with pre-existing code; simply the process of implementation for new code), then it is not refactoring.

Refactoring can be a contentious topic from a business perspective because, by definition, it should have no functional impact. The economic justifications are:

  1. Performance: the code works better under given runtime conditions.
  2. Maintainability: the code can be more quickly and easily modified under conditions of change to existing functionality.
  3. Extensibility: the code can be more quickly and easily modified given requirements for additional functionality.

Note that “bringing into line with the overall architectural vision” is not a justification in itself, as it should just be a catch-all way of expressing the above points. (If not, what is your architecture achieving exactly?)

Anyway, moving on, I thought I would share with you a fairly straightforward case I had to deal with recently (note: names have been changed to protect the not-so-innocent!). I was given a story card describing some desired functional change from a use-case perspective. Superficially, the change was simple, requiring only a change to some data “lookups” which were used to generate HTML select boxes for a user interface. Therefore, beginning with the relevant controller classes for the user interfaces that required the change, I was eventually confronted with the following class, which is, to all intents and purposes, a data access object. It was being instantiated directly within a “view helper” class. Natch.

public class LegacyDAO extends AbstractLegacyDAO {

    private static final String QUERY_TBL3 = "Select tbl3_id, tbl3_name from tbl3 where tbl3_boolean = 1 order by tbl3_name";

    private static final String QUERY_WITH_JOIN =
        "select distinct  tbl1_id, " +
        "   tbl1_name,tbl1_boolean,tbl2_string " +
        "from " + "   tbl1, " + "   tbl2 " +
        "where " +
        "   tbl1_join_key = tbl2_join_key and " +
        "   (tbl2_some_id = 9005 or " +
        "       (tbl2_some_id = ? and tbl2_string in ('FOO','BAR'))) and " +
        "   tbl2_Start_date < sysdate and " +
        "   tbl2_end_date > sysdate " +
        "order by" + "   tbl1_name";

    private List rowsFromTbl3;

    private List hardCodedData;

    private List rowsFromTbls1And2;

    private List subsetOfRowsFromTbls1And2;

    public List listSubsetOfRowsFromTbls1And2(int someId) {
        try {
            if (subsetOfRowsFromTbls1And2 == null) {
                List allRows = listRowsFromTbls1And2(someId);
                Iterator it = allRows.iterator();
                while (it.hasNext()) {
                    KeyNameObject row = (KeyNameObject) it.next();
                    if (!row.getId().toUpperCase().endsWith("_SOMETHING")) {
                        if (subsetOfRowsFromTbls1And2 == null) {
                            subsetOfRowsFromTbls1And2 = new ArrayList();
                        }
                        subsetOfRowsFromTbls1And2.add(row);
                    }
                }
            }

        } catch (Throwable e) {
            throw new FullTraceException(e);
        }
        return subsetOfRowsFromTbls1And2;
    }

    public List listRowsFromTbls1And2(int someId) {

        rowsFromTbls1And2 = new ArrayList();
        Connection con = null;
        PreparedStatement ps = null;
        ResultSet rs = null;
        try {
            con = getDataSource().getConnection();
            ps = con.prepareStatement(QUERY_WITH_JOIN);
            ps.setInt(1,someId);
            rs = ps.executeQuery();
            while (rs.next()) {
                int theId = rs.getInt("tbl1_id");

                String name = rs.getString("tbl1_name");
                String string = rs.getString("tbl2_string");
                boolean bool = rs.getInt("tbl1_boolean") == 1;

                if (string.toUpperCase().startsWith("FOO1")) {
                    string = "FOO1 FULL NAME";
                } else if (string.equalsIgnoreCase("FOO2")) {
                    string = "FOO2 FULL NAME";
                } else if (string.equalsIgnoreCase("FOO3")) {
                    string = "FOO3 FULL NAME";
                } else if (string.equalsIgnoreCase("FOO4")) {
                    string = "FOO4 FULL NAME";
                } else if (string.equalsIgnoreCase("FOO5")) {
                    string = "FOO5 FULL NAME";
                } else {
                    string = null;
                }

                KeyNameObject row = new KeyNameObject();
                String id = (new Integer(theId)).toString();
                if (string != null) {
                    id = id + "_" + string;
                } else {
                    id = id + "_DEFAULT FOO";
                }
                row.setId(id);
                if (name.toLowerCase().startsWith("the")) {
                    String str = name.substring(0, "the".length());
                    name = name.substring("the".length() + 1, name.length()) + ", " + str;
                }

                if (string != null) {
                    name = name + " (" + string + ")";
                } else {
                    name = name + " (DEFAULT FOO)";
                }
                row.setName(name);
                boolean added = false;
                for (int i = 0; i < rowsFromTbls1And2.size(); i++) {
                    KeyNameObject obj = (KeyNameObject) rowsFromTbls1And2.get(i);
                    if (row.getName().toUpperCase().compareTo(obj.getName().toUpperCase()) < 0) {
                          rowsFromTbls1And2.add(i, row);
                          added = true;
                          break;
                    }
                }

                if (!added) {
                    rowsFromTbls1And2.add(row);
                }
            }
        } catch (Throwable e) {
            throw new FullTraceException(e);
        } finally {
            try {
                rs.close();
            } catch (Throwable ignore) {

            }
            rs = null;
            try {
                ps.close();
            } catch (Throwable ignore) {

            }
            ps = null;
            try {
                con.close();
            } catch (Throwable ignore) {

            }
            con = null;

        }
        return rowsFromTbls1And2;
    }

    public List listRowsFromTbl3() {
        if (rowsFromTbl3 != null) {
            return rowsFromTbl3;
        }
        rowsFromTbl3 = new ArrayList();
        Connection con = null;
        PreparedStatement ps = null;
        ResultSet rs =   null;
        try {
            con = getDataSource().getConnection();
            ps = con.prepareStatement(QUERY_TBL3);
            rs = ps.executeQuery();
            while (rs.next()) {
                int id = rs.getInt("tbl3_id");
                String name = rs.getString("tbl3_name");
                KeyNameObject obj = new KeyNameObject();
                obj.setId(String.valueOf(id));
                obj.setName(name);
                rowsFromTbl3.add(obj);
            }
        } catch (Throwable e) {
            throw new FullTraceException(e);
        } finally {
            try {
                rs.close();
            } catch (Throwable ignore) {

            }
            rs = null;
            try {
                ps.close();
            } catch (Throwable ignore) {

            }
            ps = null;
            try {
                con.close();
            } catch (Throwable ignore) {

            }
            con = null;

        }
        return rowsFromTbl3;
    }

    public List listHardCodedData() {
        if (hardCodedData != null) {
            return hardCodedData;
        }
        hardCodedData = new ArrayList();
        KeyNameObject role1 = new KeyNameObject();
        role1.setName("NAME1");
        hardCodedData.add(role1);

        KeyNameObject role2 = new KeyNameObject();
        role1.setName("NAME2");
        hardCodedData.add(role2);
        return hardCodedData;
    }

    @Override
    protected String getPrimaryKeyQuery() {
        return null;
    }

    @Override
    protected Class getObjectClass() {
        return LegacyDAO.class;
    }

}

On seeing this, naturally, I groaned; the class was lit up like a Christmas tree by all the compiler warnings. I looked for the tests: of course, there were none. So I knuckled down and decided what to do about it. In line with my usual approach to such things, the broad strategy went like this:

  1. Read the code closely! We know what it is doing, because we know the user interface output, but how is it doing it? It has been in production for a long time, so we can safely assume that it must basically work, even if there are some known or currently-unknown “glitches”.
  2. Find all the things that are wrong with this class, both in simple terms (i.e. outright bugs) and given the current architectural plan. Try to determine which parts of the code had simply become redundant since they were written (i.e. identify straightforward deletes).
  3. Determine a basic plan for correcting the deficiencies and come up with a back-of-the-envelope estimate for how long each correction would take to complete.
  4. Given the budget and time constraints of the business change request driving the work, determine which of the changes would be appropriate within the scope of the current story card. Those that are too large can be marked for later completion, either as independent technical-debt story cards or as subsequent functional change requests.

My notes ran as follows:

  • Generally speaking, all each of the methods does is return query results mapped to a simple, generic KeyNameObject class. It also hard-codes some manipulations of “name” values in order to produce the desired view output. The fact that it requires 200+ lines to achieve this is outrageous enough. Ideally, the desired view output names would be persistent values themselves and not be generated on read (this kind of “on read” manipulation is a failing I see very regularly: normalise your data on write, people!)
  • It extends a deprecated support class (which I am here calling AbstractLegacyDAO), a large class of which this class uses only the getDataSource method. Consequently, the class is compelled to “implement” 2 abstract methods from this class (getPrimaryKeyQuery and getObjectClass), which it does incorrectly. We could, therefore, factor out this inheritance with ease, eradicating these pointless pseudo-implementations in the process. (Note: when you end up with pointless implementations like this, it is a clear sign that your inheritance is wrong, wrong, wrong.)
  • The data accessor methods lazily initialise the results and “cache” them as instance members. The class is consequently not thread-safe. I then checked the number of times this class is used and these methods are called. Unsurprisingly, it turned out to be precisely 1. Hence the lazy initialisation is a totally unnecessary “optimisation” that only increases the memory footprint of the class and increases complexity.
  • No generics are used, since authorship of the class predated adoption of Java 1.5. At the least, strengthening the type safety of the API would be an easy win.
  • Most of the class consists of boilerplate JDBC. Moreover, the data source is obtained (in the abstract superclass) via a static lookup. This could make testing difficult, except that I knew that testing of these static lookups had already been addressed as part of a move to Spring done some time back; test harnesses were therefore already in place to deal with this. The adoption of Spring since this class was written meant that we could now also easily eradicate the boilerplate JDBC. We could, potentially, also make this DAO a Spring bean but, since neither the view helper (part of a Struts 1 app) nor the action had yet been Spring-ified, this would require some fairly extensive changes, which would increase both the time required for the work and its change-impact.
  • The on-read “name manipulations” were effectively describing a small enumeration. The complexity of the code could be reduced and its readability improved by actually using one (i.e. an enum)!
  • The “hard-coded data” method was patently absurd and did not even succeed in its own limited terms, containing an obvious copy & paste bug. Fortunately, on further investigation, it turned out to be completely unused. I silently hoped that had always been the case.
  • The join query was sub-optimal, and a quick explain plan showed that using a left join rather than a subselect would improve the performance of the query in this case.
  • Both the queries and the code are full of magic numbers and strings. There is a general plan in place to deal with this “facet” of the application; however, resolving these issues in this case would be beyond the scope of the work, as it would require changes in user interaction patterns amongst a number of other things, requiring the buy-in of the stakeholder.
  • Why do people insist on writing their code in such an illegible manner (e.g. those static strings containing the SQL queries, whose excessive concatenation makes the queries almost unreadable [as well as being pointless, as they are only referenced once]) when, I would argue, it is as easy, if not easier, to achieve a consistent style? Also, at what point do you think the original author of this class might have stopped and begun to think that maybe, just maybe, 90 lines was a tad excessive for a simple data accessor method?! Clearly, s/he had never heard of “extract method”. A regular pattern of short, simple methods not only reduces complexity but also improves readability, by virtue of the fact that humans are extremely adept at pattern recognition. What is often referred to as “beauty” (and too often dismissed by technicians as “mere aesthetics”) can, in fact, be described in such terms. The aesthetics of your code is not merely aesthetic: it is functional!
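To illustrate the “easy win” of generifying the API, here is a minimal, self-contained sketch (the KeyNameObject below is a stand-in for the simple key/name bean used in the real code, and the list methods are invented for the example). The raw-typed signature forces an unchecked cast at every call site; the generified one makes the element type part of the contract:

```java
import java.util.ArrayList;
import java.util.List;

public class GenericsDemo {

    // Stand-in for the article's simple key/name bean.
    static class KeyNameObject {
        private String id;
        private String name;
        String getId() { return id; }
        void setId(String id) { this.id = id; }
        String getName() { return name; }
        void setName(String name) { this.name = name; }
    }

    // Raw-typed legacy-style signature: callers must cast, and the
    // compiler cannot check what the list actually contains.
    static List listRaw() {
        List l = new ArrayList();
        KeyNameObject o = new KeyNameObject();
        o.setId("1");
        o.setName("one");
        l.add(o);
        return l;
    }

    // Generified signature: the element type is declared in the API,
    // so call sites need no casts and get compile-time checking.
    static List<KeyNameObject> listTyped() {
        List<KeyNameObject> l = new ArrayList<KeyNameObject>();
        KeyNameObject o = new KeyNameObject();
        o.setId("1");
        o.setName("one");
        l.add(o);
        return l;
    }

    public static void main(String[] args) {
        KeyNameObject fromRaw = (KeyNameObject) listRaw().get(0); // unchecked cast
        KeyNameObject fromTyped = listTyped().get(0);             // no cast needed
        System.out.println(fromRaw.getName().equals(fromTyped.getName())); // prints "true"
    }
}
```

The behaviour is identical either way; the win is entirely in the API contract and in eliminating the compiler warnings that lit this class up in the first place.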

So, armed with this knowledge of the code, it was time to set about writing a good test ensuring that, before I even contemplated introducing the requested functional changes (let alone starting to refactor), only (a) desired effects and not (b) undesired side-effects would be the outcome of the work.

The process of writing such tests for “legacy” code can seem a little back-to-front because, rather than the usual “write test and change code till it passes”, we instead “write test and keep changing it till it passes”. In other words: the code itself is the acceptance criterion of the test. We cannot change the code until our test passes. This often involves a process of “copying” or interpreting the existing implementation in test form. My test looked something like this:

public class LegacyDAOTest {

    private LegacyDAO dao;

    private JdbcTemplate db;

    @Mock private BeanContainer container;

    @Before public void setup() throws Exception {
        initMocks(this);
        Properties p = new Properties();
        p.load(getClass().getResourceAsStream("/jdbc-test.properties"));
        SimpleDriverDataSource ds = new SimpleDriverDataSource((Driver) Class.forName(p.getProperty("jdbc.driver")).newInstance(), p.getProperty("jdbc.url"), p.getProperty("jdbc.username"), p.getProperty("jdbc.password"));
        when(container.get(DataSource.class)).thenReturn(ds);
        StaticBeanFactory.setContainer(container);
        db = new JdbcTemplate(ds);
        dao = new LegacyDAO();
    }

    @After public void teardown() throws Exception {
        StaticBeanFactory.setContainer(new DefaultContainer());
    }

    @Test public void shouldReturnSubsetOfRowsFromTbls1And2() throws Exception {
        List<Integer> journals = db.queryForList("select some_id from some_table", Integer.class);
        for (Integer id : journals) {
            List<KeyNameObject> expected = getExpectedSubsetOfRowsFromTbls1And2(id);
            List<KeyNameObject> actual = dao.listSubsetOfRowsFromTbls1And2(id);
            assertThat(actual, equalTo(expected));
        }
    }

    @Test public void shouldReturnRowsFromTbls1And2() throws Exception {
        List<Integer> journals = db.queryForList("select some_id from some_table", Integer.class);
        for (Integer id : journals) {
            List<KeyNameObject> expected = getExpectedRowsFromTbls1And2(id);
            List<KeyNameObject> actual = dao.listRowsFromTbls1And2(id);
            assertThat(actual, equalTo(expected));
        }
    }

    @Test public void shouldReturnRowsFromTbl3() throws Exception {
        List<KeyNameObject> expected = db.query("select tbl3_id, tbl3_name from tbl3 where tbl3_boolean = 1 order by tbl3_name", new RowMapper<KeyNameObject>() {

            @Override
            public KeyNameObject mapRow(ResultSet rs, int rowNum) throws SQLException {
                KeyNameObject kno = new KeyNameObject();
                kno.setId(rs.getString("tbl3_id"));
                kno.setName(rs.getString("tbl3_name"));
                return kno;
            }

        });
        List<KeyNameObject> actual = dao.listRowsFromTbl3();
        assertThat(actual, equalTo(expected));
    }

    private List<KeyNameObject> getExpectedRowsFromTbls1And2(Integer journalId) {
        List<KeyNameObject> expected = db
                .query("select distinct tbl1_id, tbl1_name, tbl2_string from tbl1 join tbl2 on tbl2_join_key = tbl1_join_key where (tbl2_some_id = 9005 or (tbl2_some_id = ? and tbl2_string in ('FOO','BAR'))) and tbl2_start_date < sysdate and tbl2_end_date > sysdate order by tbl1_name",
                        new RowMapper<KeyNameObject>() {

                            @Override
                            public KeyNameObject mapRow(ResultSet rs, int rowNum) throws SQLException {
                                KeyNameObject o = new KeyNameObject();
                                String string = getMembershipType(rs.getString("tbl2_string"));
                                String id = rs.getString("tbl1_id");
                                String name = rs.getString("tbl1_name");
                                if (name.toLowerCase().startsWith("the")) {
                                    String str = name.substring(0, "the".length());
                                    name = name.substring("the".length() + 1, name.length()) + ", " + str;
                                }
                                o.setId(id + "_" + string);
                                o.setName(name + " (" + string + ")");
                                return o;
                            }

                            private String getMembershipType(String membershiptype) {
                                if (membershiptype.toUpperCase().startsWith("FOO1")) {
                                    return "FOO1 FULL NAME";
                                } else if (membershiptype.equalsIgnoreCase("FOO2")) {
                                    return "FOO2 FULL NAME";
                                } else if (membershiptype.equalsIgnoreCase("FOO3")) {
                                    return "FOO3 FULL NAME";
                                } else if (membershiptype.equalsIgnoreCase("FOO4")) {
                                    return "FOO4 FULL NAME";
                                } else if (membershiptype.equalsIgnoreCase("FOO5")) {
                                    return "FOO5 FULL NAME";
                                } else {
                                    return "DEFAULT FOO";
                                }
                            }

                        }, journalId);
        Collections.sort(expected, new Comparator<KeyNameObject>() {

            @Override
            public int compare(KeyNameObject o1, KeyNameObject o2) {
                return o1.getName().toUpperCase().compareTo(o2.getName().toUpperCase());
            }

        });
        return expected;
    }

    private List<KeyNameObject> getExpectedSubsetOfRowsFromTbls1And2(Integer journalId) {
        List<KeyNameObject> expected = db
                .query("select distinct tbl1_id, tbl1_name, tbl2_string from tbl1 join tbl2 on tbl2_join_key = tbl1_join_key where (tbl2_some_id = 9005 or (tbl2_some_id = ? and tbl2_string in ('FOO','BAR'))) and tbl2_start_date < sysdate and tbl2_end_date > sysdate and not regexp_like(tbl2_string, '^.*_SOMETHING$') order by tbl1_name",
                        new RowMapper<KeyNameObject>() {

                            @Override
                            public KeyNameObject mapRow(ResultSet rs, int rowNum) throws SQLException {
                                KeyNameObject o = new KeyNameObject();
                                String string = getMembershipType(rs.getString("tbl2_string"));
                                String id = rs.getString("tbl1_id");
                                String name = rs.getString("tbl1_name");
                                if (name.toLowerCase().startsWith("the")) {
                                    String str = name.substring(0, "the".length());
                                    name = name.substring("the".length() + 1, name.length()) + ", " + str;
                                }
                                o.setId(id + "_" + string);
                                o.setName(name + " (" + string + ")");
                                return o;
                            }

                            private String getMembershipType(String membershiptype) {
                                if (membershiptype.equalsIgnoreCase("FOO2")) {
                                    return "FOO2 FULL NAME";
                                } else if (membershiptype.equalsIgnoreCase("FOO3")) {
                                    return "FOO3 FULL NAME";
                                } else if (membershiptype.equalsIgnoreCase("FOO4")) {
                                    return "FOO4 FULL NAME";
                                } else if (membershiptype.equalsIgnoreCase("FOO5")) {
                                    return "FOO5 FULL NAME";
                                } else {
                                    return "DEFAULT FOO";
                                }
                            }

                        }, journalId);
        Collections.sort(expected, new Comparator<KeyNameObject>() {

            @Override
            public int compare(KeyNameObject o1, KeyNameObject o2) {
                return o1.getName().toUpperCase().compareTo(o2.getName().toUpperCase());
            }

        });
        return expected;
    }

}

As you can see, my test here replicates the existing implementation in most respects. I don’t try to do anything especially clever: the priority is to get some test code that accurately reproduces the output of the existing class as quickly as possible. The primary differences are:

  • I use the improved query using a join in my test code and, for the method which obtains a subset of rows, I have a separate query that includes the required filtering clause, so that I do not have to iterate over the result set twice to achieve this.
  • I use existing test harness techniques (a mock Container, using the wonderful Mockito) for the StaticBeanFactory. Because it is static, I need to restore the default container in a teardown to ensure that I do not interfere with any subsequent tests.
  • I use JdbcTemplate and extract the row mapping logic into RowMappers (which were not available when the code was written, prior to the adoption of Spring — although I’m sure it would have been possible to conceive of one’s own alternative!)
  • To sort the results (which is done on the basis of the names as transformed in the code, so cannot be done in the query), I use a Comparator implementation which I pass to Collections.sort(List, Comparator). This option was available when the code was written and is infinitely preferable to the frankly painful sight of watching some poor fool iterating over the results again and again trying to do this manually!
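As a stand-alone illustration of that last point, a Comparator handed to Collections.sort keeps the case-insensitive ordering in one place and the sort itself to a single call (the data and method name here are invented for the example):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class SortDemo {

    // Returns a copy of the input sorted case-insensitively,
    // mirroring the comparator used in the test above.
    static List<String> sortIgnoringCase(List<String> names) {
        List<String> copy = new ArrayList<String>(names);
        Collections.sort(copy, new Comparator<String>() {
            @Override public int compare(String a, String b) {
                return a.toUpperCase().compareTo(b.toUpperCase());
            }
        });
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortIgnoringCase(Arrays.asList("banana", "Apple", "CHERRY")));
        // prints [Apple, banana, CHERRY]
    }
}
```

Compare this with the 12-line manual insertion loop in listRowsFromTbls1And2 above, which re-scans the list for every row it adds.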

It is worth noting also that, in this test, there are effectively multiple assertions per test because, for the sake of iron-clad confidence in the changes, I go through the entire table, testing against each available row identified by some_id (as I am calling it in my obfuscation). This, of course, makes the test take quite a long time to run. Before committing the test to the build, I spent a little time selecting a small number of critical cases and tested only those IDs, because otherwise the build would have ground to a halt.

Given that this test passes (N.B. please excuse any errors I may have introduced whilst trying to obfuscate to protect the innocent — the real version works fine, I promise!), I then set about incrementally changing the implementations of the methods that I was going to retain through a series of IDE refactorings, supported by the introduction of a number of embedded classes aimed at removing duplication and reducing complexity:

@SuppressWarnings("synthetic-access") public class LegacyDAORefactored {

    private static class CaseInsensitiveNameComaparator implements Comparator<KeyNameObject> {

        @Override
        public int compare(KeyNameObject o1, KeyNameObject o2) {
            return o1.getName().toUpperCase().compareTo(o2.getName().toUpperCase());
        }
    }

    private static enum MatchType {
        TO_UPPER_CASE_STARTS_WITH {

            @Override public boolean matches(String a, String b) {
                return b != null && b.toUpperCase().startsWith(a);
            }

        },
        EQUALS_IGNORE_CASE {

            @Override public boolean matches(String a, String b) {
                return a.equalsIgnoreCase(b);
            }

        };

        public abstract boolean matches(String a, String b);
    }

    private static enum Foo {
        FOO1_FULL_NAME("FOO1_DB_VALUE", MatchType.TO_UPPER_CASE_STARTS_WITH),
        FOO2_FULL_NAME("FOO2", MatchType.EQUALS_IGNORE_CASE),
        FOO3_FULL_NAME("FOO3", MatchType.EQUALS_IGNORE_CASE),
        FOO4_FULL_NAME("FOO4", MatchType.EQUALS_IGNORE_CASE),
        FOO5_FULL_NAME("FOO5", MatchType.EQUALS_IGNORE_CASE),
        DEFAULT_FOO(null, null);

        static Foo forString(String str) {
            for (Foo foo : values()) {
                if (!DEFAULT_FOO.equals(foo) && foo.matcher.matches(foo.string, str)) return foo;
            }
            return DEFAULT_FOO;
        }

        private final String string;

        private final MatchType matcher;

        private final String fullName;

        private Foo(String string, MatchType matcher) {
            this.string = string;
            this.matcher = matcher;
            fullName = name().replaceAll("_", " ");
        }

        public String getIdSuffix() {
            return "_" + fullName;
        }

        public String getNameSuffix() {
            return " (" + fullName + ")";
        }
    }

    private static class KeyNameObjectRowMapper implements RowMapper<KeyNameObject> {

        private static String putTheAtTheEnd(String name) {
            if (name.toLowerCase().startsWith("the")) {
                String str = name.substring(0, "the".length());
                return name.substring("the".length() + 1, name.length()) + ", " + str;
            }
            return name;
        }

        @Override
        public KeyNameObject mapRow(ResultSet rs, int rowNum) throws SQLException {
            KeyNameObject kno = new KeyNameObject();
            String string = rs.getString("tbl2_string");
            Foo mt = Foo.forString(string);
            kno.setId(rs.getString("tbl1_id") + mt.getIdSuffix());
            kno.setName(putTheAtTheEnd(rs.getString("tbl1_name")) + mt.getNameSuffix());
            return kno;
        }

    }

    private static final RowMapper<KeyNameObject> RM = new KeyNameObjectRowMapper();

    private static final Comparator<KeyNameObject> CMP = new CaseInsensitiveNameComaparator();

    private final JdbcTemplate db;

    public LegacyDAORefactored() {
        db = new JdbcTemplate((DataSource) StaticBeanFactory.getBean(DataSource.class));
    }

    public List<KeyNameObject> listSubsetOfRowsFromTbls1And2(int someId) {
        List<KeyNameObject> res = db.query("select distinct tbl1_id, tbl1_name, tbl2_string from tbl1 join tbl2 on tbl2_join_key = tbl1_join_key where (tbl2_some_id = 9005 or (tbl2_some_id = ? and tbl2_string in ('FOO','BAR'))) and tbl2_start_date < sysdate and tbl2_end_date > sysdate and not regexp_like(tbl2_string, '^.*_SOMETHING$') order by tbl1_name", RM, someId);
        Collections.sort(res, CMP);
        return res;
    }

    public List<KeyNameObject> listRowsFromTbls1And2(int someId) {
        List<KeyNameObject> res = db.query("select distinct tbl1_id, tbl1_name, tbl2_string from tbl1 join tbl2 on tbl2_join_key = tbl1_join_key where (tbl2_some_id = 9005 or (tbl2_some_id = ? and tbl2_string in ('FOO','BAR'))) and tbl2_start_date < sysdate and tbl2_end_date > sysdate order by tbl1_name", RM, someId);
        Collections.sort(res, CMP);
        return res;
    }

    public List<KeyNameObject> listRowsFromTbl3() {
        return db.query("select tbl3_id, tbl3_name from tbl3 where tbl3_boolean = 1 order by tbl3_name", new RowMapper<KeyNameObject>() {

            @Override
            public KeyNameObject mapRow(ResultSet rs, int rowNum) throws SQLException {
                KeyNameObject kno = new KeyNameObject();
                kno.setId(rs.getString("tbl3_id"));
                kno.setName(rs.getString("tbl3_name"));
                return kno;
            }

        });
    }

}

Let’s go through some of the changes that were supported by the integration test, then …

  1. The Comparator implementation required to do the final sorting of the result on the basis of mutated name values obtained from the database is used by more than one method. I therefore create a nested class for this and, because it is threadsafe, reference it via a static, final member on the DAO class.
  2. The many if ... else if ... else statements used to evaluate and transform the name values obtained from the database are encapsulated into a couple of enums using a technique I am very fond of whereby we have an abstract method in the enum which allows each value to encapsulate small differences in logic. Here, for example, I have a Matcher enum which expresses the 2 different methods used to compare strings. I subsequently create a Foo enum that contains the various name values I expect from the database. Each Foo is instantiated with the required Matcher type. This, in turn, means that we can now …
  3. Introduce a single RowMapper implementation shared by the 2 main data access methods (again, referenced via a static, final member on the DAO class) that can calculate the desired mutations of the name and ID values obtained from the database in a single statement, simply by calling Foo.forString(string_from_database). Nice. All the details of the enumerated values and their respective strategies for matching are encapsulated within the enums. I strongly recommend this way of using Java enums as a form of implementation of the strategy pattern wherever possible. It really is very neat.
  4. We strongly-type the API by introducing generics into the collection return types. Quick and easy. (It also then becomes possible to subsequently remove some casts in calling code, which gives you the real gain here.)
  5. The very simple “table 3” method is changed into a very standard JdbcTemplate method with an inline row mapper. Again, a very easy change but one that really must be supported by the kind of test shown above to ensure that you don’t subtly alter the output in some unanticipated way.
  6. The absurd and unused “hard coded data” method is simply removed with an IDE refactoring: don’t just delete it!
  7. The DAO no longer extends AbstractLegacyDAO, getting rid of one more dependency on a heavily deprecated class. The just-plain-wrong implementations of the inherited abstract methods from that class could consequently be removed, too.
  8. The class no longer “caches” the results of the query as instance members. Because we know that the class is only called once (and under no circumstances should we be adding new usages of this class), it is unnecessary to do this and better that this is simply removed. So I removed it. Besides which, “embedding” the concern of “caching” directly in the DAO in this manner, aside from being done in an extremely poor way, is wrong: if it turns out later that we need this, it can be introduced as a cross-cutting concern using reliable and robust techniques.
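The enum-as-strategy technique described in items 2 and 3 can be sketched roughly as follows. The value names and match rules here are illustrative assumptions, not the actual production values:

```java
// A minimal sketch of the enum-as-strategy technique: an abstract method on
// the enum lets each constant encapsulate its own small difference in logic.
enum Matcher {
    EXACT {
        @Override
        boolean matches(String candidate, String target) {
            return candidate.equalsIgnoreCase(target);
        }
    },
    PREFIX {
        @Override
        boolean matches(String candidate, String target) {
            return candidate.toUpperCase().startsWith(target.toUpperCase());
        }
    };

    // each enum constant supplies its own comparison strategy
    abstract boolean matches(String candidate, String target);
}

enum Foo {
    FOO(Matcher.EXACT),
    BAR(Matcher.PREFIX);

    private final Matcher matcher;

    Foo(Matcher matcher) {
        this.matcher = matcher;
    }

    // single entry point usable from a shared RowMapper: each value
    // delegates to the Matcher strategy it was constructed with
    static Foo forString(String s) {
        for (Foo value : values()) {
            if (value.matcher.matches(s, value.name())) {
                return value;
            }
        }
        throw new IllegalArgumentException("Unrecognised value: " + s);
    }
}
```

All the matching details stay inside the enums, so calling code only ever needs `Foo.forString(...)`.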

Clearly, the changes that were made did not address many of the bigger architectural issues, such as the magic numbers, the data structure issues that require manipulation of persistent values on-read, the fact that this DAO is instantiated and called only within a “view helper”, or its obtaining a DataSource via a deprecated static lookup rather than being dependency injected. Nonetheless, I hope you agree that the refactored version of the code is, given the speed and ease of the changes (once the test was in place), considerably improved. Consequently, dealing with the bigger issues at a later date, under the auspices of work which would grant that scope, becomes feasible. Certainly, I found that changing the test in a simple way and subsequently introducing the requested functional changes was far easier and more reliable as a result. This is the everyday bread & butter of refactoring: improve, improve, improve, each time a class is modified, so that you get your codebase into a position where it becomes possible to safely handle bigger and bigger issues, architecturally speaking. Sometimes it can feel like you are getting nowhere and, if you have people in your team writing new crap faster than you can clean it up, then you can encounter problems. However, even a small team of good developers united and acting consistently in this way can effect remarkable improvements over even large codebases in a surprisingly short amount of time.

On June 6, 2011 christopher wrote: On Test Harnesses & Testing Batch Emailers With Dumbster

One of my many rather sad obsessions is the subject of test harnesses: I am a strong advocate of the argument that, if you are unable to cleanly articulate a harness that isolates the runtime integration points of a set of code units such that one can write a good end-to-end test, then your understanding of your application architecture is deficient in a critical way.

Batch emailing provides a good example of how useful this understanding can be. Most companies have some kind of “newsletter” that they send out to whoever signs up for it (and they would usually like as many people as possible to sign up, for obvious commercial reasons). I have encountered real-world situations where the implementation of such systems starts out working fine for a comparatively small number of initial subscribers but, as the “newsletter” becomes more and more heavily subscribed-to, things start to go horribly wrong: it consumes more and more system resources; poor data access gridlocks the database; the task slows to a halt and cannot even complete the batch before it is due to run again (if it even gets that far). In other words, you can encounter some classic scalability issues. Therefore, whilst the technical problem itself is fairly trivial, we can get some good testing scenarios out of such an app:

  1. The set of potential subscribers is unbounded. Therefore, scalability concerns should indicate the necessity to prove the application via “soak testing” which can assert that a threshold number of emails n can be delivered within a given time period t. This can have the additional benefit of providing scalability metrics that can be used to project when additional modifications, such as clustering and/or striping, might become necessary.
  2. You often need to be able to generate a realistic, but random, infinite set of data to accurately replicate the dynamic nature of the system at runtime (e.g. do all emails have the same content? If not, you are probably having to go to the database every so often, if not for each email, to get the required data; this needs to be replicated to properly soak test the system).
  3. You probably also want to be able to make a reasonable stab at replicating the mail spool which, given 100,000s of messages, may well itself become a point of contention within the system (i.e. more than just mocking a JavaMailSender). However, you need to do so without any risk of sending a real email to a real recipient and with the facility to make assertions about the email that is enqueued.

In terms of general abstractions (ignoring contextual optimisations, such as caches), the logic of such systems is usually pretty standard:

  1. A batch task executor (e.g. cron or a Java substitute, such as Quartz).
  2. A batch task to execute.
  3. A system of record for subscribers.
  4. A system of record for email data.
  5. An email renderer (i.e. a template engine such as Velocity or Freemarker).
  6. A mail spool (i.e. an SMTP server).
Batch E-mailer Architecture

Architectural description of the generic batch e-mailer application

The integration points for such a simple architecture are straightforward: they are represented by the points at which the lines from grey components (extrinsic to the system itself) cross the system threshold, represented by the dotted-line box.

Now, for the sake of argument (and to make the problem a little more interesting), let’s say that the content of the emails per-subscriber is not uniform (ignoring obvious essential differences, such as salutations etc). What this tends to mean in practice is that one or more parts of the subscriber description data is a parameter to the query used to obtain e-mail data. As mentioned above, what this means from a testing perspective is that we will need to ensure that this aspect of the application is accounted for within our test harness: a test which simply returns uniform e-mail content will probably not be sufficiently accurate. This might give us a test harness something like the following:

Batch E-mailer architecture test harness

Integration points and testing approaches

  1. Our test cases themselves will become the task executor.
  2. With all transactions set to roll back, we can use a real data source (although obviously not your real live application data unless you are very brave [for “brave” there, read “stupid”]). Using the real data source will help us to get properly representative data for the test cases whilst using rollback will ensure that the data remains the same for the next time the tests are run. Moreover, having a full integration test using a real database will replicate that load; you may, for instance, want to enable some form of database activity logging so that assertions can be made about such things, too. However, we will almost certainly need to proxy the data sources so that, where insufficient “real data” is available for a particular test case (e.g. a test case which says “keep sending emails for a given period of time”, meaning the number required is unknown), we can generate random, but nonetheless representative, data on demand.
  3. Finally, we can use Dumbster as a fake mail server. However, whilst Dumbster is a very useful piece of kit, it has some limitations for use in this kind of scenario: the sent emails will accumulate in memory until “received” by the test case. Consequently, for large batch “soak” tests such as we are discussing here, it is necessary to flush the server occasionally during the test execution in order to prevent out-of-memory exceptions. Additionally, therefore, we also need to be able to “inject” assertions into our mail server because there will be no way we can aggregate all mail after the test has completed and make assertions about it then without running into the same memory issues.

Obviously, you should be writing your tests first but let’s begin by looking at a sketch of my suggested implementation for the batch emailer itself, so that there is some context for the test harness examples that follow. We begin with a basic java.lang.Runnable for the batch task:

@Component public class SpringBatchMailerTask implements BatchMailerTask {

    private final EmailContentDAO emailContentDAO;

    private final EmailRenderer emailRenderer;

    private final JavaMailSender emailSender;

    private final SubscriberDAO subscriberDAO;

    @Autowired public SpringBatchMailerTask(EmailContentDAO emailContentDAO, EmailRenderer emailRenderer, JavaMailSender emailSender, SubscriberDAO subscriberDAO) {
        this.emailContentDAO = emailContentDAO;
        this.emailRenderer = emailRenderer;
        this.emailSender = emailSender;
        this.subscriberDAO = subscriberDAO;
    }

    @Transactional(propagation = Propagation.REQUIRES_NEW) public void processSubscriber(Subscriber subscriber) {
        EmailContent content = emailContentDAO.getEmailContent(subscriber);
        String body = emailRenderer.render(subscriber, content);
        MimeMessagePreparator mail = new BatchMailerMimeMessagePreparator(subscriber.getEmail(), content.getFromEmail(), content.getSubject(), body, subscriber.isHtmlEmailSubscription());
        emailSender.send(mail);
    }

    public void run() {
        subscriberDAO.executeForSubscribers(this);
    }

}

Whilst I usually like to extract my interfaces through refactoring, I have pre-emptively added in a BatchMailerTask interface in the code here to avoid repetition. The interface looks like this:

public interface BatchMailerTask extends Runnable, SubscriberCallbackHandler {}

As you can see, the interface merely aggregates java.lang.Runnable and another separate custom SubscriberCallbackHandler row callback interface that looks like this:

public interface SubscriberCallbackHandler {

    void processSubscriber(Subscriber subscriber);

}

The rationale for this is simple: because we are dealing with a potentially large, unbounded data set, there is no question of loading all the data into memory as a collection. We will need to iterate over any result set, dealing with each row singly to avoid potential excessive memory usage. The custom callback interface therefore serves as a strongly-typed facade to (in this case) Spring’s RowCallbackHandler. This allows us to have a clean SubscriberDAO implementation that, if we were using some simple JDBC, might look something like the following:

@Repository public class JdbcSubscriberDAO implements SubscriberDAO {

    private static final class SubscriberRowCallbackHandler implements RowCallbackHandler {

        private static final RowMapper<Subscriber> RM = new SubscriberRowMapper();

        private final SubscriberCallbackHandler callback;

        SubscriberRowCallbackHandler(SubscriberCallbackHandler callback) {
            this.callback = callback;
        }

        public void processRow(ResultSet rs) throws SQLException {
            callback.processSubscriber(RM.mapRow(rs, rs.getRow()));
        }

    }

    private JdbcTemplate t;

    private final String sql;

    @Autowired public JdbcSubscriberDAO(DataSource dataSource) throws IOException {
        t = new JdbcTemplate(dataSource);
        sql = IOUtils.toString(getClass().getResourceAsStream("/com/christophertownson/mail/dao/find-subscribers.sql"), "UTF-8");
    }

    public void executeForSubscribers(SubscriberCallbackHandler callback) {
        t.query(sql, new SubscriberRowCallbackHandler(callback));
    }

}

And there you have it. Whilst we’re on the subject of DAOs, let’s move on to look at the proxying and random generation of data. Proxying is a classic AOP use case, so I am going to use an aspect for this:

@Component @Aspect public class SubscriberDAOAdvisor {

    private JdbcTemplate db;

    private long numberOfEmailsToSend = 500000;

    private long realSubscriberCount;

    private boolean useRealSubscribersFirst = false;

    @Autowired public SubscriberDAOAdvisor(DataSource dataSource) {
        db = new JdbcTemplate(dataSource);
    }

    @Around("execution(* com.christophertownson.dao.SubscriberDAO.executeForSubscribers(..))") public Object feedDataToCallback(ProceedingJoinPoint pjp) throws Throwable {
        SubscriberCallbackHandler callback = (SubscriberCallbackHandler) pjp.getArgs()[0];
        long numberOfRandomSubscribersToGenerate = getNumberOfRandomSubscribersToGenerate();
        if (useRealSubscribersFirst) pjp.proceed();
        long sent = 0;
        while (sent < numberOfRandomSubscribersToGenerate) {
            callback.processSubscriber(Fixtures.randomSubscriber());
            sent++;
        }
        return null;
    }

    public void setNumberOfEmailsToSend(long numberOfEmailsToSend) {
        this.numberOfEmailsToSend = numberOfEmailsToSend;
    }

    public void setUseRealSubscribersFirst(boolean useRealSubscribersFirst) {
        this.useRealSubscribersFirst = useRealSubscribersFirst;
    }

    private long getNumberOfRandomSubscribersToGenerate() throws Exception {
        if (!useRealSubscribersFirst) return numberOfEmailsToSend;
        realSubscriberCount = db.queryForLong(IOUtils.toString(getClass().getResourceAsStream("/count-subscribers.sql")));
        // only top up with random subscribers where the real ones do not suffice
        return numberOfEmailsToSend > realSubscriberCount ? numberOfEmailsToSend - realSubscriberCount : 0;
    }
    }

}

This class is stateful but that is fine here because we get a new instance for each test case / task execution. Because we are using a callback style, we intercept the first DAO call (to which the row callback is passed): we can then feed the callback either real data, fake random data, or a mixture of both (real data supplemented by fake data).

I shouldn’t really need to tell you what Fixtures.randomSubscriber() does but, for your amusement, I’ll show you what my basic, generic, random data generator looks like so you get the general idea:

public final class Fixtures {

    private static final String[] TLD = { "biz", "com", "coop", "edu", "gov", "info", "net", "org", "pro", "co.uk", "gov.uk", "ac.uk" };

    public static boolean randomBoolean() {
        return randomLong() % 2 == 0;
    }

    public static Date randomDate() {
        return new Date(randomLong());
    }
    
    public static Date randomDate(Date start, Date end) {
        return new Date(randomLong(start.getTime(), end.getTime()));
    }

    public static String randomEmail() {
        return randomString(1) + "." + randomString(8) + "@" + randomString(8) + "." + randomTopLevelDomain();
    }

    public static Integer randomInt() {
        return randomInt(Integer.MIN_VALUE, Integer.MAX_VALUE);
    }

    public static Integer randomInt(int min, int max) {
        return randomLong(min, max).intValue();
    }

    public static Long randomLong() {
        // cannot delegate to randomLong(Long.MIN_VALUE, Long.MAX_VALUE) here:
        // (max - min) + 1 overflows to zero across the full range of long
        return new Random().nextLong();
    }

    public static Long randomLong(long min, long max) {
        return min + (long) (Math.random() * ((max - min) + 1));
    }

    public static String randomString(int length) {
        byte[] bytes = new byte[length];
        for (int i = 0; i < length; i++) {
            bytes[i] = randomLong(97, 122).byteValue(); // let's just stick to lower-alpha, shall we?
        }
        return new String(bytes);
    }

    public static String randomTopLevelDomain() {
        return TLD[randomInt(0, TLD.length - 1)];
    }

    public static String randomHttpUrl() {
        return "http://" + randomString(8) + "." + randomTopLevelDomain();
    }

    private Fixtures() {}

}

Needless to say, the complex type returned by Fixtures.randomSubscriber() would just need to be composed from the relevant simple types above.
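For illustration, a hypothetical randomSubscriber() might look like the following. The Subscriber fields shown are assumptions (a minimal stand-in is declared here, and local copies of the simple generators are inlined, purely so the sketch stands alone):

```java
import java.util.Date;
import java.util.Random;

// Hypothetical sketch: a complex fixture type is just a composition of the
// simple random-value generators.
class RandomSubscriberSketch {

    private static final Random RANDOM = new Random();

    // minimal stand-in for the real Subscriber domain class
    static class Subscriber {
        String email;
        String name;
        boolean htmlEmailSubscription;
        Date subscribedSince;
    }

    static String randomString(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + RANDOM.nextInt(26))); // lower-alpha only
        }
        return sb.toString();
    }

    static String randomEmail() {
        return randomString(1) + "." + randomString(8) + "@" + randomString(8) + ".com";
    }

    static Subscriber randomSubscriber() {
        Subscriber s = new Subscriber();
        s.email = randomEmail();
        s.name = randomString(6) + " " + randomString(8);
        s.htmlEmailSubscription = RANDOM.nextBoolean();
        // a random instant between the epoch and now
        s.subscribedSince = new Date(Math.abs(RANDOM.nextLong() % System.currentTimeMillis()));
        return s;
    }
}
```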

Proxying and generation of random data for EmailContentDAO.getEmailContent(Subscriber) is even more straightforward:

@Component @Aspect public class EmailContentDAOAdvisor {

    private boolean useRealContent = false;

    @Around("execution(* com.christophertownson.dao.EmailContentDAO.getEmailContent(..))") public Object returnRandomEmailContent(ProceedingJoinPoint pjp) throws Throwable {
        Subscriber subscriber = (Subscriber) pjp.getArgs()[0];
        EmailContent content = null;
        if (useRealContent) content = (EmailContent) pjp.proceed();
        if (content == null) content = Fixtures.randomEmailContent(subscriber); // we can use subscriber values to "seed" random content, if necessary
        return content;
    }

    public void setUseRealContent(boolean useRealContent) {
        this.useRealContent = useRealContent;
    }
}

The final component in our test harness is the SMTP server itself. As I said, for this I will be using a wrapped Dumbster server. My first cut looked a little like this:

@Component("smtpProxy") public class SmtpProxy {

    private MailAssertionListener[] assertionListeners = {};

    private final int port;

    private SimpleSmtpServer smtp;

    private int totalNumberOfEmailsSent = 0;

    @Autowired public SmtpProxy(@Value("${smtp.port}") int port) {
        this.port = port;
    }

    public void flush() {
        stop();
        @SuppressWarnings("unchecked") Iterator<SmtpMessage> messages = smtp.getReceivedEmail();
        while (messages.hasNext()) {
            SmtpMessage msg = messages.next();
            Email email = new Email(msg); // Email class here is a simple adapter for SmtpMessage
            totalNumberOfEmailsSent++;
            if (assertionListeners != null && assertionListeners.length > 0) {
                for (MailAssertionListener assertion : assertionListeners) {
                    assertion.doAssertion(email);
                }
            }
        }
        start();
    }

    public int getTotalNumberOfEmailsSent() {
        return totalNumberOfEmailsSent;
    }

    @Autowired(required = false) public void setAssertionListeners(MailAssertionListener[] assertionListeners) {
        this.assertionListeners = assertionListeners;
    }

    @PostConstruct public void start() {
        smtp = SimpleSmtpServer.start(port);
    }

    @PreDestroy public void stop() {
        smtp.stop();
    }

}

Even better than using Dumbster here would be to implement this as an extension to a real, embeddable SMTP server (if someone would like to volunteer an implementation based on something like James, please do!) because then it would not be necessary to flush like this at intervals. Nevertheless, this basic implementation is “good enough” for the time being. If we need to inject assertions, we can do so by implementing the MailAssertionListener interface and injecting the implementations either manually or via Spring (note: the assertions here must be true of all emails sent for a given test case, so divide up your test cases and injected assertions accordingly):

public interface MailAssertionListener {

    void doAssertion(Email sent);

}

@Component public class ValidEmailAddressAssertionListener implements MailAssertionListener {

    public void doAssertion(Email sent) {
        assertThat(EmailValidator.getInstance().isValid(sent.getRecipient()), is(true));
    }

}

Last, but by no means least, we just need to tie our little test harness together with some of those obligatory pointy brackets:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:context="http://www.springframework.org/schema/context"
    xmlns:task="http://www.springframework.org/schema/task"
    xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
        http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-3.0.xsd
        http://www.springframework.org/schema/task http://www.springframework.org/schema/task/spring-task-3.0.xsd">

    <context:property-placeholder location="classpath:test.properties" system-properties-mode="OVERRIDE" />

    <context:component-scan base-package="com.christophertownson" />

    <bean id="mailSender" class="org.springframework.mail.javamail.JavaMailSenderImpl">
        <property name="host" value="localhost" />
        <property name="port" value="${smtp.port}" />
    </bean>

    <task:scheduler id="taskScheduler" />

    <task:scheduled-tasks scheduler="taskScheduler">
        <task:scheduled ref="smtpProxy" method="flush" fixed-delay="${smtp.proxy.flushInterval}" />
    </task:scheduled-tasks>

</beans>

All of the above puts us in a position to write a “soak” test case such as …

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = {"/applicationContext-test.xml"})
@Transactional
@TransactionConfiguration(defaultRollback = true)
public class BatchMailerTaskSoakTest {

    @Autowired private BatchMailerTask task;

    @Autowired private SmtpProxy smtp;

    @Autowired private SubscriberDAOAdvisor subscriberAdvisor;

    @Autowired private EmailContentDAOAdvisor emailAdvisor;

    @Test(timeout = 86400000) public void shouldBeAbleToSpamOneMillionPeoplePerDay() {
        subscriberAdvisor.setNumberOfEmailsToSend(1000000);
        subscriberAdvisor.setUseRealSubscribersFirst(true);
        emailAdvisor.setUseRealContent(true);
        task.run(); // the test case is the batch task executor
        smtp.flush(); // one final flush before assertions are made
        assertThat(smtp.getTotalNumberOfEmailsSent(), is(1000000));
    }

}

Given that the test passes, we now know that our implementation would be capable of spamming at least 1,000,000 people per day without breaking a sweat, so long as our real mail server were also up to the task … All we need to do now is check that our “unsubscribe” functionality also scales accordingly!

Naturally, a test such as this is not the kind of thing you run as part of your regular build. Running potentially for a whole day is also somewhat extreme. Nonetheless, this kind of test can be extremely valuable for testing long-running batch tasks. Moreover, the kind of test harness you can get out of this can have more general applicability. For example, with some functional additions, I currently use a version of the SmtpProxy in development environments to ensure that mail can never get out to real users: everything is either dumped as a file to an inbox folder or forwarded to a pre-configured email address (if the recipient is not contained within a recipient whitelist). This puts an end to such foolishness as code explicitly checking to see whether it is in a “dev” environment and branching accordingly, because the environment-specific behaviour that is desired in such circumstances is obtained in a manner that is completely external to the application itself, which need know nothing about it (not even the configuration need change). Similar approaches can be adopted for pretty much any protocol (FTP, HTTP etc).
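A whitelist redirect of the kind just described can be sketched as follows; the class and method names are hypothetical, and the real version would sit inside the proxy’s delivery path:

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: in a dev environment, any message addressed to a
// non-whitelisted recipient is transparently rerouted to a safe catch-all
// address, so real users can never receive test mail.
class RecipientWhitelist {

    private final Set<String> allowed;   // lower-cased addresses allowed through
    private final String catchAll;       // safe address everything else goes to

    RecipientWhitelist(Set<String> allowed, String catchAll) {
        this.allowed = allowed;
        this.catchAll = catchAll;
    }

    // returns the address the mail should actually be delivered to
    String resolve(String recipient) {
        String normalised = recipient.toLowerCase(Locale.ROOT);
        return allowed.contains(normalised) ? recipient : catchAll;
    }
}
```

The application itself never sees this class: the proxy applies it at the integration point, which is exactly what keeps environment-specific branching out of the application code.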

On May 16, 2011 christopher wrote: End-to-End Testing With Maven, Jetty & JBehave

One of those seemingly trivial topics of debate amongst software developers which is liable to irk me is the subject of dependencies. There is nothing more frustrating than checking out a project only to discover that it has no end of “externals” and other assorted environmental dependencies that are outside of the control of the project itself. Ideally, the structure and build support for an executable project (e.g. a web application) should make it possible for it to be run “out-of-the-box” because three significant costs are incurred where this is not the case:

  1. “Time-to-start” for any new developer is increased.
  2. The application becomes more fragile through exposure to change in external dependencies.
  3. End-to-end functional, in-container testing of the application becomes correspondingly more complex, the setup process for which now has to effectively document and keep pace with changes to external systems (which, in practice, are often not known until after your tests start failing, leading to wasted time debugging the cause of failure).

The argument I sometimes hear against my approach is that it attempts to create a monolithic system (the “one ring to rule them all” syndrome) and that separation into “modules” helps to create smaller, more manageable applications (smaller: maybe; more manageable: definitely not, in my view). “Modularisation” and “service-oriented” approaches need not incur the loss of compile-time safety (which is an advantage of a language like Java) nor the fragmentation of integration concerns. To demonstrate this, and for my own enjoyment and education, I have recently been working in my spare time on putting together a small web application that I call “Apposite”. In this and subsequent posts, I would like to share with you what I see as some of the key structural and architectural approaches that I am adopting, beginning with the assumption that the entire application must always remain executable as a standalone artefact using just the build script, so that it is possible both to do end-to-end functional testing of the application as part of the build and to do “live coding” (where changes in compiled code are quickly visible within a running instance).

The application itself is a very conventional Java web application built using the now-standard stack of Spring and Hibernate. For builds, I am using Maven. For testing, I am using JUnit (of course), WebDriver, and JBehave (whilst arguably not as good as Cucumber, it is easier to use with Java projects and fits more nicely with my “out-of-the-box” non-functional requirements). As you can see, hardly groundbreaking stuff … but something I still see done “wrong” in so many places and by so many people (which is somewhat inevitable by virtue of its popularity).

Getting the basic project structure in place

Let’s begin from the very basics, assuming a standard Maven web application project structure:

  • apposite
    • pom.xml
    • src
      • main
        • java
        • resources
        • webapp
          • WEB-INF
            • web.xml
      • test
        • java
        • resources

In the pom.xml, we declare the testing dependencies we are going to be using, starting with JUnit:

        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.8.2</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.jbehave</groupId>
            <artifactId>jbehave-core</artifactId>
            <version>3.3.2</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-common</artifactId>
            <version>${selenium.version}</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-firefox-driver</artifactId>
            <version>${selenium.version}</version>
            <scope>test</scope>
            <exclusions>
                <exclusion>
                    <groupId>commons-logging</groupId>
                    <artifactId>commons-logging</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-support</artifactId>
            <version>${selenium.version}</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-test</artifactId>
            <version>3.0.5.RELEASE</version>
            <scope>test</scope>
        </dependency>

Functional tests are inherently slower to run than unit tests and we do not necessarily want to run them all the time. Therefore, we want to execute them only during Maven’s integration test phase and, even then, only when we specify a functional test profile. To achieve this, we begin by adding the maven-failsafe-plugin to the build section of our POM:

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-failsafe-plugin</artifactId>
                <version>2.8</version>
            </plugin>

By default, this will run JUnit tests that match the pattern **/*IT.java during the integration test phase of the build. You can stick with the default; however, I prefer the slightly more descriptive naming convention **/*FunctionalTest.java. That can yield slightly over-long test names, but it is at least blindingly clear what sort of test your test class is! To ensure my preferred test naming convention does not conflict with the standard surefire plugin defaults, I configure excludes and includes in the surefire-plugin:
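
To make the split concrete, here is a tiny, self-contained Java sketch (class and method names are mine, purely for illustration) of how the two include/exclude patterns partition a set of test classes between surefire and failsafe:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: mimics the effect of the surefire includes/excludes and
// the failsafe **/*FunctionalTest.java include used in this POM.
public class NamingConventionDemo {

    // surefire: include **/*Test.java, exclude **/*FunctionalTest.java
    public static boolean runBySurefire(String className) {
        return className.endsWith("Test") && !className.endsWith("FunctionalTest");
    }

    // failsafe (FunctionalTests profile): include **/*FunctionalTest.java only
    public static boolean runByFailsafe(String className) {
        return className.endsWith("FunctionalTest");
    }

    public static void main(String[] args) {
        List<String> classes = Arrays.asList("UserDaoTest", "SecurityFunctionalTest", "StringUtils");
        System.out.println("surefire: " + classes.stream()
                .filter(NamingConventionDemo::runBySurefire).collect(Collectors.toList()));
        System.out.println("failsafe: " + classes.stream()
                .filter(NamingConventionDemo::runByFailsafe).collect(Collectors.toList()));
    }
}
```

So a class like SecurityFunctionalTest is run exactly once, by failsafe, and never by surefire; nothing runs twice.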

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.8.1</version>
                <configuration>
                    <includes>
                        <include>**/*Test.java</include>
                    </includes>
                    <excludes>
                        <exclude>**/*FunctionalTest.java</exclude>
                    </excludes>
                </configuration>
            </plugin>

By default in Maven, the webapp source folder is not on the test classpath, but this kind of in-container functional testing is much easier if it is. To place the webapp folder on the classpath, you can use the build-helper-maven-plugin:

            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>build-helper-maven-plugin</artifactId>
                <version>1.5</version>
                <executions>
                    <execution>
                        <id>add-test-resource</id>
                        <phase>generate-test-sources</phase>
                        <goals>
                            <goal>add-test-resource</goal>
                        </goals>
                        <configuration>
                            <resources>
                                <resource>
                                    <directory>src/main/webapp</directory>
                                </resource>
                            </resources>
                        </configuration>
                    </execution>
                </executions>
            </plugin>

Next in the build plugins we need to declare and configure the jetty-maven-plugin so that we can fire up the entire web application using mvn jetty:run:

            <!-- You need to specify -Djetty.port=${port} if 8080 is already bound on the build machine -->
            <plugin>
                <groupId>org.mortbay.jetty</groupId>
                <artifactId>jetty-maven-plugin</artifactId>
                <version>7.2.0.v20101020</version>
                <dependencies>
                    <!-- you can declare what would commonly be your container-provided dependencies here, such as log4j etc -->
                </dependencies>
                <configuration>
                    <webAppConfig>
                        <contextPath>/${project.artifactId}</contextPath>
                    </webAppConfig>
                    <jettyConfig>src/test/resources/jetty.xml</jettyConfig>
                    <useTestClasspath>true</useTestClasspath>
                    <scanIntervalSeconds>10</scanIntervalSeconds>
                    <stopKey>${project.artifactId}-stop</stopKey>
                    <stopPort>9999</stopPort>
                </configuration>
            </plugin>

Note the use of the jetty.xml jettyConfig there: I use it to declare a JNDI datasource, so that all my app has to know is the JNDI name; the actual details of the datasource are always encapsulated within the container (in this case, the test harness). Note that this needs to be a Jetty server config: using a Jetty web application config will result in horrific memory leaks around database connections with the regular restarts that you may well want if you are doing “live coding” against a running Jetty instance using this build config. In Apposite, I am using an in-memory HSQLDB database for testing, so I declare c3p0 and HSQLDB as container-provided dependencies of the jetty plugin and my jetty.xml looks like this:

<?xml version="1.0"  encoding="UTF-8"?>
<!DOCTYPE Configure PUBLIC "-//Mort Bay Consulting//DTD Configure//EN" "http://jetty.mortbay.org/configure.dtd">
<Configure class="org.eclipse.jetty.server.Server">
    <New class="org.eclipse.jetty.plus.jndi.Resource">
        <Arg>jdbc/apposite</Arg>
        <Arg>
            <New class="com.mchange.v2.c3p0.ComboPooledDataSource">
                <Set name="driverClass">org.hsqldb.jdbcDriver</Set>
                <Set name="jdbcUrl">jdbc:hsqldb:mem:apposite</Set>
                <Set name="user">sa</Set>
                <Set name="password"></Set>
            </New>
        </Arg>
    </New>
</Configure>

The final touch in the POM is to set up a functional tests profile so that we can execute the in-container tests using mvn verify -PFunctionalTests (or -PWhateverYouCallYourProfile); note that because failsafe binds to the integration-test and verify phases, plain mvn test will never reach them:

        <profile>
            <id>FunctionalTests</id>
            <activation>
                <activeByDefault>false</activeByDefault>
            </activation>
            <build>
                <plugins>
                    <plugin>
                        <groupId>org.apache.maven.plugins</groupId>
                        <artifactId>maven-failsafe-plugin</artifactId>
                        <configuration>
                            <includes>
                                <include>**/*FunctionalTest.java</include>
                            </includes>
                        </configuration>
                        <executions>
                            <execution>
                                <id>integration-test</id>
                                <goals>
                                    <goal>integration-test</goal>
                                </goals>
                            </execution>
                            <execution>
                                <id>verify</id>
                                <goals>
                                    <goal>verify</goal>
                                </goals>
                            </execution>
                        </executions>
                    </plugin>
                    <plugin>
                        <groupId>org.mortbay.jetty</groupId>
                        <artifactId>jetty-maven-plugin</artifactId>
                        <executions>
                            <execution>
                                <id>start-jetty</id>
                                <phase>pre-integration-test</phase>
                                <goals>
                                    <goal>run</goal>
                                </goals>
                                <configuration>
                                    <scanIntervalSeconds>0</scanIntervalSeconds>
                                    <daemon>true</daemon>
                                </configuration>
                            </execution>
                            <execution>
                                <id>stop-jetty</id>
                                <phase>post-integration-test</phase>
                                <goals>
                                    <goal>stop</goal>
                                </goals>
                            </execution>
                        </executions>
                    </plugin>
                </plugins>
            </build>
        </profile>

In this profile, we tell the jetty plugin to fire up our webapp just before the integration test phase starts (and to stop it once it is complete) and inform the failsafe plugin that it should execute tests that match the naming convention **/*FunctionalTest.java (you could, of course, avoid this bit of configuration by simply accepting the default **/*IT.java convention … but I have something of a dislike of “public abbreviations” like this). Note also that the scanIntervalSeconds property is set to 0: we do not want jetty accidentally detecting some change on the classpath due to code generating some resource there and restarting mid-test as a consequence. Setting this property to 0 turns off the jetty plugin change scan. This overrides the setting in the build plugin configuration (where we set it to 10 seconds), which was intended to have the opposite effect: when we do “live coding” against a running jetty instance, we want it to pick up and deploy our code changes regularly.

As usual with Maven, there is a bit of an excess of pointy brackets here … but once you have a useful project structure you can always turn it into an archetype, thus avoiding the need to recreate it by hand every time.

Creating a “framework” for the functional tests

We now have a web application (albeit with no actual code) that we can fire up and run from our build script and which will automatically run certain tests in-container during the build if and when we so choose. Before we plough ahead, we should stop and give a little thought to how we want to divide up our functional tests and what kind of support infrastructure they might benefit from.

The first thought that occurred to me at this point is “I’m using Spring already, so surely I must be able to re-utilise it to make managing my test code easier?” This is indeed possible but, if you are using annotation-based Spring context configuration (as I am), then it is a very good idea to use a completely separate namespace for your functional test “context” to ensure there is no chance of it becoming mixed up with your real application. In the case of Apposite, my application namespace is org.apposite. Therefore, rather than use org.apposite.bdd (or similar), I opted for bdd.org.apposite: no chance of a conflict, as it is not a subset of the application namespace. I began by making a minimal application context config that would provide pretty much all the supporting infrastructure I would need for my tests:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:context="http://www.springframework.org/schema/context"
    xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
    http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-3.0.xsd">

    <context:property-placeholder location="classpath:application.properties,classpath:environment.properties" system-properties-mode="OVERRIDE" />

    <context:component-scan base-package="bdd.org.apposite" />

    <bean id="web" class="org.openqa.selenium.firefox.FirefoxDriver" scope="prototype" destroy-method="close" />

</beans>

This file achieves the following:

  1. Provides access to “application” and “environment” properties (of the real application) from src/main/resources/application.properties and src/test/resources/environment.properties, respectively, so that these values can be utilised to construct test cases and assertions. Note that environment.properties defines properties specific to the container or the environment and would normally be provided by the target deployment container. The version in src/test/resources therefore replicates this requirement for the test container whilst also serving as a form of documentation, thus helping me to keep these two distinct areas of configuration entirely separate whilst maintaining runnability “out-of-the-box”.
  2. Instantiates annotated components within the test namespace only, via context:component-scan.
  3. Provides access to WebDriver (I’m using the Firefox driver here). Notice that the scope is prototype so that each test will get its own new instance. Here I also specify a destroy-method: that is a bit of “belt & braces” paranoia to try to ensure that we do not leave Firefox windows open on completion of a test case (it doesn’t actually achieve that, due to the nature of the Spring bean lifecycle in relation to test execution, but out of pure superstition I felt it better defined than not, if you know what I mean).
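
The property-resolution order implied by point 1 can be sketched in plain Java. This is not Spring’s implementation, just a hedged illustration of what system-properties-mode="OVERRIDE" plus the two comma-separated locations means (a JVM system property wins, then environment.properties, then application.properties; the property key used here is hypothetical):

```java
import java.util.Properties;

// Illustration only: the resolution order implied by the
// property-placeholder declaration in functional-tests.xml.
public class PropertyResolutionDemo {

    public static String resolve(String key, Properties application, Properties environment) {
        String fromSystem = System.getProperty(key);
        if (fromSystem != null) {
            return fromSystem; // OVERRIDE mode: system properties win
        }
        if (environment.containsKey(key)) {
            return environment.getProperty(key); // later location overrides earlier
        }
        return application.getProperty(key); // may be null if nothing defines the key
    }

    public static void main(String[] args) {
        Properties application = new Properties();
        application.setProperty("apposite.base.url", "http://localhost:8080/apposite");
        Properties environment = new Properties();
        environment.setProperty("apposite.base.url", "http://localhost:9090/apposite");
        // environment.properties wins over application.properties
        System.out.println(resolve("apposite.base.url", application, environment));
    }
}
```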

Next, I started thinking about how I wanted to divide up my tests. This is worth doing if you are using something like JBehave, because one of the first things you will need to do is to tell it how to load “story files”. Consequently, knowing which story files to load is important. If you simply load all your story files in one go (which is an option) you will have one massive functional test. That may not necessarily be a problem, but it does limit things somewhat, especially in terms of reporting, and may quickly become unwieldy for anything but the smallest and most simple of applications. In line with general BDD guidelines, I wanted to split my tests up into “functional areas” within which one or more user scenarios could be encapsulated (for example, “registration”). I opted for a one-to-one relation between a functional test class and story file because then, once a basic test execution mechanism was in place, I would simply be able to write the story file and create a simple test class that specified the story file to run. I would then see these as individually executed tests within both the Maven integration test phase and in the JBehave reporting. I felt the cost of having to produce at least one “no code” class per story file was offset by the flexibility this approach would provide. After writing a few tests, I extracted the following through a process of refactoring:

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration("classpath:bdd/org/apposite/functional-tests.xml")
public abstract class AbstractFunctionalTest extends Embedder {

    @Autowired private GenericApplicationContext ctx;

    private final String includes;

    protected AbstractFunctionalTest(String storyFile) {
        includes = "bdd/org/apposite/" + storyFile;
    }

    @Test public void runStories() {
        Object[] steps = ctx.getBeansWithAnnotation(Steps.class).values().toArray();
        useCandidateSteps(new InstanceStepsFactory(
                configuration().useStoryReporterBuilder(new StoryReporterBuilder()
                        .withCodeLocation(codeLocationFromClass(getClass()))
                        .withFormats(CONSOLE, TXT, XML, HTML)),
                steps).createCandidateSteps());
        runStoriesAsPaths(new StoryFinder().findPaths(codeLocationFromClass(getClass()), includes, ""));
    }

}

}

This class extends org.jbehave.core.embedder.Embedder, which is the key thing in enabling it to become an executable JUnit test that runs JBehave stories. (We also have to override some of JBehave’s frankly terrible reporting defaults!) It utilises the Spring testing framework and the test context described above to allow autowiring of JBehave “steps” (which are just POJOs whose methods, annotated with @Given, @When, and @Then, are matched against the corresponding lines from a story file). I created a custom Spring component stereotype called @Steps:

@Target({ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
@Component
public @interface Steps {
    String value() default "";
}

All Spring beans annotated with the @Steps annotation are presented to JBehave as candidate steps for the execution of a story file, which is specified by a concrete subclass. At present, there is no clever filtering of candidate steps, and all story files are assumed at the very least to live within the bdd.org.apposite namespace. You could probably be much more sophisticated if necessary, but this has proved sufficient for my requirements so far.
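
Conceptually, the discovery that ctx.getBeansWithAnnotation(Steps.class) performs boils down to a runtime annotation check on each bean’s class. A toy sketch (not Spring’s code; names are my own):

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

// Illustration only: filtering "beans" down to those carrying a
// runtime-retained stereotype annotation, as getBeansWithAnnotation does.
public class StepsDiscoveryDemo {

    @Target(ElementType.TYPE)
    @Retention(RetentionPolicy.RUNTIME) // without RUNTIME retention the check below never matches
    public @interface Steps { }

    @Steps public static class SecuritySteps { }

    public static class SomeOtherBean { }

    public static List<Object> beansWithSteps(Object... beans) {
        List<Object> matched = new ArrayList<>();
        for (Object bean : beans) {
            if (bean.getClass().isAnnotationPresent(Steps.class)) {
                matched.add(bean);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<Object> candidates = beansWithSteps(new SecuritySteps(), new SomeOtherBean());
        System.out.println(candidates.size()); // only SecuritySteps matches
    }
}
```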

Writing the tests

Now we are in a position to write a basic story file. Let’s start with a “security” feature, for example: “I should not be able to access the administration dashboard unless I am logged in and have sufficient privileges”. Whilst perhaps not the best example of a “functional area” (because “security” is in reality a constituent component of other functional areas and cuts across them) it will nonetheless suffice for now:

Scenario: I cannot access the administration dashboard unless I am logged in

Given I am not logged in
When I go to the administration dashboard
Then I am asked to login
Then I enter the administrator credentials
Then I am redirected to the administration dashboard

This is an extremely basic example of a story file. For a full description of all the possible features of story files in JBehave, check out their documentation (which, whilst it appears comprehensive, does not win any prizes for clarity). Nonetheless, I’m sure you get the gist: each scenario consists of some preconditions, an operation, and some postconditions. You can have as many scenarios per “story file” as you like, separated by Scenario: declarations (this is why I opted to use the *.stories extension rather than the more common *.story; the former made more grammatical sense to me).

Next, because I opted for a one-to-one relation between story files and executable tests, you need to create a very simple class that will indicate that this story file needs to be run:

public class SecurityFunctionalTest extends AbstractFunctionalTest {

    public SecurityFunctionalTest() {
        super("security.stories");
    }

}

Bingo! We have a behaviour-driven functional test. “But where the hell is all the logic?” I am sure you are asking (if you are, it is, of course, the right question … and I shall move on to that now).

Organising responsibilities into page objects

The good people behind WebDriver (and almost anyone else who has done this kind of testing) rightly recommend that you take the important step of describing your web application in terms of “page objects”. A page object should, loosely speaking, correspond to the response from a given URI and encapsulate the “services” (forms, links, key information) that it provides. In the test case above, I am interested in three pages:

  1. A “logout” page (this is the least obvious, but bear in mind that we need to encapsulate access to a URI that will ensure we are not logged in to complete the first step)
  2. The “administration dashboard” page
  3. The “login” page

Clearly, there are going to be many things that are common to all pages (even if you have a very incoherent user interface). For example, at the very least, you should be able to “visit” all pages. Also, for testing purposes, we should be able to assert what page we are currently on. Therefore, I will cut to the chase and begin with an abstract superclass for them all that can be used to describe these common features:

public abstract class AbstractPage {

    private static final int DEFAULT_PORT = 8080;

    private String url;

    private WebDriver web;

    protected AbstractPage(String uri, WebDriver web) {
        this.web = web;
        int port = System.getProperty("jetty.port") != null ? Integer.valueOf(System.getProperty("jetty.port")) : DEFAULT_PORT;
        url = "http://localhost:" + port + "/apposite" + uri;
    }

    public void assertIsCurrentPage() {
        assertThat(isCurrentPage(), is(true));
    }

    public abstract boolean isCurrentPage();

    public final void visit() {
        web.get(url);
    }

}

There is a bug in maven-surefire-plugin < 2.6 which will prevent the system property being available to the test here, so you will have to hard-code the port on which you run your functional tests or upgrade. See SUREFIRE-121.

Our abstract page is responsible for managing the WebDriver instance (which it requires for instantiation) and coordinating what port, host, and context path we are running on (the latter two hardcoded here for the sake of simplicity but easily externalisable through system properties if necessary). This means that concrete page instances only need to specify what URI they have (without having to consider context paths etc.) and the abstract superclass can perform all the actual “getting” and so forth.
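
The port and URL resolution can be exercised on its own. A minimal standalone sketch (class name hypothetical; same jetty.port property and hard-coded host and context path as above):

```java
// Illustration only: the port/URL resolution performed in AbstractPage's constructor.
public class PageUrlDemo {

    private static final int DEFAULT_PORT = 8080;

    public static String buildUrl(String uri) {
        // Integer.getInteger reads and parses the system property, with a default fallback
        int port = Integer.getInteger("jetty.port", DEFAULT_PORT);
        return "http://localhost:" + port + "/apposite" + uri;
    }

    public static void main(String[] args) {
        // with no jetty.port set, this prints http://localhost:8080/apposite/admin
        System.out.println(buildUrl("/admin"));
    }
}
```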

Next, we define our actual, concrete pages:

public class AdministrationDashboardPage extends AbstractPage {

    @FindBy(css = "h2") private WebElement heading;

    public AdministrationDashboardPage(WebDriver web) {
        super("/admin", web);
    }

    @Override public boolean isCurrentPage() {
        return "Administration Dashboard".equals(heading.getText());
    }

}

This first page definition is extremely simple. Using the “finder” annotations provided by the selenium-support artefact, we use the presence of some heading text to determine whether or not we are actually on the admin dashboard page. I will come back to the @FindBy annotation in a little while.

public class LogoutPage extends AbstractPage {

    public LogoutPage(WebDriver web) {
        super("/users/logout", web);
    }

    @Override public boolean isCurrentPage() {
        return false;
    }

}

The logout page is even simpler because we should only ever need to “visit” it and we should never actually be “on” it. It is just a URI.

public class LoginPage extends AbstractPage {

    @FindBy(css = "#j_username") private WebElement username;

    @FindBy(css = "#j_password") private WebElement password;

    public LoginPage(WebDriver web) {
        super("/users/login", web);
    }

    public void enterUsername(String username) {
        this.username.sendKeys(username);
    }

    public void enterPassword(String password) {
        this.password.sendKeys(password);
    }

    public void login() {
        password.submit();
    }

    @Override public boolean isCurrentPage() {
        return username != null && password != null;
    }

}

The login page shows a little more about how page objects are intended to be used, in that it encapsulates access to critical form elements (again, injected using the WebDriver annotations) in what can be viewed as “service methods”. This means that any change in the page should only ever require an update to one code location. However, it does also place considerable importance on being able to accurately identify what “a page” really is in your application. This becomes more complex when you are dealing with asynchronous JavaScript making calls to HTTP “services” from within a supposed page: those services are, in effect, pages themselves, even if the human end-user never sees them in the raw, so keep an open mind about the definition of a page!
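
The @FindBy wiring in these page objects is performed by PageFactory.initElements. To demystify that a little, here is a toy reflection sketch of the general idea; this is not Selenium’s actual implementation, and the types are simplified (a String stands in for WebElement):

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Toy sketch of the PageFactory idea: instantiate a page class, then populate
// each @FindBy-annotated field with the "element" its locator finds.
public class PageFactorySketch {

    @Target(ElementType.FIELD)
    @Retention(RetentionPolicy.RUNTIME)
    public @interface FindBy { String css(); }

    public static class Page {
        @FindBy(css = "h2") String heading; // a String stands in for WebElement here
    }

    public static <T> T initElements(Class<T> pageClass) {
        try {
            T page = pageClass.getDeclaredConstructor().newInstance();
            for (Field field : pageClass.getDeclaredFields()) {
                FindBy findBy = field.getAnnotation(FindBy.class);
                if (findBy != null) {
                    field.setAccessible(true);
                    // a real driver would do something like web.findElement(By.cssSelector(...))
                    field.set(page, "located by " + findBy.css());
                }
            }
            return page;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Page page = initElements(Page.class);
        System.out.println(page.heading); // located by h2
    }
}
```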

Organising logic into steps objects

Finally, we need to create implementations for the “steps” which are detailed in our “story file”. One of the nice features of JBehave is that you can tell it not to fail tests where implementations have not yet been written, marking these as “pending” in the reports. This way, one group of people can get busy writing stories which do not have to be committed at the same time as the implementations, opening the possibility of these two tasks being completed by separate groups; e.g. “business analysts” on the one hand and software developers on the other.

Again, there are certainly going to be a number of common aspects to these steps objects, so you can begin with an abstract class. Mine currently looks like this:

@Scope(BeanDefinition.SCOPE_PROTOTYPE) public abstract class AbstractSteps {

    protected final WebDriver web;

    protected AbstractSteps(WebDriver web) {
        this.web = web;
    }

    @AfterScenario public void closeWebDriver() {
        web.close();
    }

    protected final <T extends AbstractPage> T initPage(Class<T> pageClass) {
        return PageFactory.initElements(web, pageClass);
    }

}

Notice that, because my steps are going to be managed by the functional-tests.xml Spring context, I can define them as prototype using @Scope so that we get a fresh instance wherever one is required, with a correspondingly fresh instance of the prototype WebDriver bean, which is autowired in. I define only one common method at present, which provides a strongly-typed means to instantiate a WebDriver page object (to use the @FindBy WebDriver annotations, you have to instantiate the page using the PageFactory.initElements(WebDriver, Object) method). Additionally, I make absolutely sure that the Firefox window gets closed by using a JBehave @AfterScenario method (this is almost directly equivalent to JUnit’s @After annotation). Now I am in a good position to start writing steps classes that shouldn’t need to worry about too much extraneous fluff. Let’s take a look at the SecuritySteps that I wrote to implement my basic story file described above:

@Steps public class SecuritySteps extends AbstractSteps {

    private String username;

    private String password;

    @Autowired public SecuritySteps(WebDriver web) {
        super(web);
    }

    @Given("I am not logged in")
    public void iAmNotLoggedIn() {
        initPage(LogoutPage.class).visit();
    }

    @When("I go to the administration dashboard")
    public void iGoToTheAdministrationDashboard() {
        initPage(AdministrationDashboardPage.class).visit();
    }

    @Then("I am asked to login")
    public void iAmAskedToLogin() {
        initPage(LoginPage.class).assertIsCurrentPage();
    }

    @Then("I enter the administrator credentials")
    public void iEnterTheAdministratorCredentials() {
        LoginPage p = initPage(LoginPage.class);
        p.enterUsername(username);
        p.enterPassword(password);
        p.login();
    }

    @Then("I am redirected to the administration dashboard")
    public void iAmRedirectedToTheAdministrationDashboard() {
        initPage(AdministrationDashboardPage.class).assertIsCurrentPage();
    }

    @Autowired public void setUsername(@Value("${apposite.security.root.user.name}") String username) {
        this.username = username;
    }

    @Autowired public void setPassword(@Value("${apposite.security.root.user.password}") String password) {
        this.password = password;
    }

}

Some things to notice about this class:

  1. It is annotated with my custom @Steps annotation so that the AbstractFunctionalTest will pick it up from the application context and present it as candidate steps for the story file.
  2. Each method (excluding the setters) corresponds to one of the statements from the story file: they are matched on the annotation text. There are far more funky things one can do in terms of parameter injection here; this is just the most basic example possible.
  3. Because I decided to use Spring to manage my steps classes, I am able not only to autowire any WebDriver instance I choose to use, but also configuration values from the application or environment properties files (using @Value annotations).

Conclusion

All of this might seem like quite a lot of code to achieve the simple goal of running in-container functional tests. Naturally, there are ways of doing it with less code. However, most of the above is simply infrastructure which can be extracted and shared between multiple projects and is intended to provide a harness that minimises the amount of work required to add test cases, which, being the most numerous “code type”, should ideally require the least amount of actual code individually. Both steps and page objects might feasibly be considered as a form of “shared library”: re-use should most definitely be a goal and, once you begin to get a more comprehensive collection, you should not always be required to write new page objects or step implementations in order to write new test cases (but don’t move them out of the project until you absolutely have to!)

Consequently, with a little up-front work and thought, it is possible to reduce the costs of adding test cases over time. This is often cited as a goal. Sadly, it is too often the case in practice that a somewhat lackadaisical approach at the outset leads to this aim being confounded: test harnesses become brittle, test code becomes more complex than the code it tests and has bugs of its own, tests become flaky, and the whole thing becomes disorganised and confused.

My aim throughout has been to balance flexibility with regularity and predictability: make the conventions clear and repeatable but not too rigid. From the perspective of the build cycle, one of the keys to achieving that predictability is making sure that developers can always run these kinds of tests whenever they need to (reducing the time to feedback). Having a build infrastructure such as the one described here helps developers to avoid not only “breaking the build” but also “breaking the functionality”.

Clearly, there is code to write in order to actually make this test pass (which I shall hopefully come on to in subsequent posts) but that is one of the really nice things about BDD and functional testing of this kind: if you are using an agile methodology (which you should be), then your JBehave stories, whilst not absolutely equivalent, should bear a close relation to your iteration story cards. From this perspective, they can serve as a “definition of done” for any given vertical slice of functionality: you can proceed using TDD at a unit test level (the tests will fail & fail & fail) but, then, once all the various units are complete, your end-to-end test will pass and you know you are done … or, I should say, you know you are “good enough”, because you shouldn’t forget to do a little refactoring and code-tidying before you consider it truly complete.