Why can’t you just communicate properly?

Online communication bugs me. Actually, bugs isn’t accurate. Maybe saddens and fatigues. When volleying with people hiding behind their keyboard shield and protected by three timezones, you have to make a conscious effort to remain optimistic. It’s part of the reason I haven’t taken to Twitter as much as I probably should.

I’ve talked on this subject before and it’s something I often have in the back of my mind when reading comments. It’s come to the forefront recently with some conversations we’ve had at Western Devs, which led to our most recent podcast. I wasn’t able to attend so here I am.

There are certain phrases you see in comments that automatically seem to devolve a discussion. They include:

  • “Why don’t you just…”
  • “Sorry but…”
  • “Can’t you just…”
  • “It’s amazing that…”

Ultimately, all of these phrases can be summarized as follows:

I’m better than you and here’s why…

In my younger years, I could laugh this off amiably and say “Oh this wacky world we live in”. But I’m turning 44 in a couple of days and it’s time to start practicing my crotchety, even if it means complaining about people being crotchety.

So to that end: I’m asking, nay, begging you to avoid these and similar phrases. This is for your benefit as much as the reader’s. These phrases don’t make you sound smart. Once you use them, it’s very unlikely anyone involved will feel better about themselves, let alone engage in any form of meaningful discussion. Even if you have a valid point, who wants to be talked down to like that? Have you completely forgotten what it’s like to learn?

“For fuck’s sake, Mom, why don’t you just type the terms you want to search for in the address bar instead of typing WWW.GOOGLE.COM into Bing?”

Now I know (from experience) it’s hard to fight one’s innate sense of superiority and the overwhelming desire to make it rain down on the unwashed heathen. So take it in steps. After typing your comment, remove all instances of “just” (except when it means “recently” or “fair”, of course). The same probably goes for “simply”, which has more of a condescending tone than a dismissive one. “Actually” is borderline. Rule of thumb: don’t start a sentence with it.
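If you’d rather automate the tic removal, it’s nearly a one-liner. A sketch, tongue firmly in cheek; note that it can’t tell which sense of “just” you meant, so review the output by hand:

```javascript
// Strip the condescension tics from a draft comment. Naive on purpose:
// it can't distinguish "just" (merely) from "just" (recently/fair).
const detic = (draft) =>
  draft
    .replace(/\b(just|simply)\s+/gi, '') // the little nervous tics
    .replace(/^Actually,?\s*/i, '');     // rule of thumb: don't open with it

console.log(detic("Why don't you just simply use a CDN?"));
// => "Why don't you use a CDN?"
```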

Once you have that little nervous tic under control, it’s time to remove the negatives. Here’s a handy replacement guide to get you started:

  • “Can’t you” → “Can you”
  • “Why don’t you” → “Can you”
  • “Sorry but” → no replacement; delete the phrase
  • “It’s amazing that…” → delete your entire comment and have a dandelion break

See the difference? Instead of saying “Sweet Zombie Jayzus, you must be the stupidest person on the planet for doing it this way”, you’ve changed the tone to “Have you considered this alternative?” In both instances, you’ve made your superior knowledge known but in the second, it’s more likely to get acknowledged. More importantly, you’re less likely to look like an idiot when the response is: “I did consider that avenue and here are legitimate reasons why I decided to go a different route.”

To be fair, sometimes the author of the work you’re commenting on needs to be knocked down a peg or two themselves. I have yet to meet one of these people who responds well to constructive criticism, let alone the destructive type I’m talking about here. Generally, I find they feel the need to cultivate an antagonistic personality but in my experience, they usually don’t have the black turtlenecks to pull it off. Usually, it ends up backfiring and their dismissive comments become too easy to dismiss over time.

– Kyle the Inclusive

Originally posted to: http://www.westerndevs.com/communication/Why-can-t-you-just/

Chocolatey Community Feed Update!

Average approval time for moderated packages is currently under 10 hours!

In my last post, I talked about things we were implementing or getting ready to implement to help with the moderation process. Those things are:

  • The validator – checks the quality of the package
  • The verifier – tests the package install/uninstall and provides logs
  • The cleaner – provides reminders and closes packages under review when they have gone stale.

The Cleanup Service

We’ve created a cleanup service, known as the cleaner, which went into production recently.

  • It looks for packages under review that have gone stale – defined as 20 or more days since the last review activity with no progress.
  • It sends a notice/reminder that the package is waiting for the maintainer to fix something and that if another 15 days go by with no progress, the package will automatically be rejected.
  • If another 15 days do go by with no progress, it automatically rejects the package with a nice message about how to pick things back up later when the maintainer is ready (the full flow is sketched below).
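For the curious, the flow is simple enough to sketch in a few lines of JavaScript. This is not the actual service code; the data shape and helper functions here are invented for illustration:

```javascript
// Minimal sketch of the cleaner's stale-package flow (invented data
// shape and helpers; the real service has more plumbing).
const DAY_MS = 24 * 60 * 60 * 1000;

// Stubs standing in for the real notification/rejection machinery.
const sendReminder = (pkg) => console.log(`Reminder sent for ${pkg.id}`);
const reject = (pkg, msg) => console.log(`Rejected ${pkg.id}: ${msg}`);

function runCleaner(packagesUnderReview, now = Date.now()) {
  for (const pkg of packagesUnderReview) {
    const daysStale = (now - pkg.lastReviewActivity) / DAY_MS;

    if (!pkg.reminderSentAt && daysStale >= 20) {
      // 20+ days with no progress: nudge the maintainer.
      sendReminder(pkg);
      pkg.reminderSentAt = now;
    } else if (pkg.reminderSentAt &&
               (now - pkg.reminderSentAt) / DAY_MS >= 15) {
      // Another 15 days with no response: auto-reject with a note on
      // how to pick things back up later.
      reject(pkg, 'No progress after reminder; resubmit when ready.');
    }
  }
}
```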

Current Backlog

We’ve found that with all of this automation in place, the moderation backlog has been quickly reduced and should continue to be manageable.

A visual comparison:

December 18, 2015 – 1,630 packages ready for a moderator

January 16, 2016 – 7 packages ready for a moderator

Note the improvements all around! The most important numbers to key in on are the first three; they represent a “waiting for a reviewer to do something” status. With the validator and verifier in place, moderation is much faster and more accurate, and the validator has increased package quality all around with its review!

The waiting for maintainer status (927 in the image above) represents the bulk of the total number of packages currently under moderation. These are packages that require an action on the part of the maintainer to actively move the package to approved. This is also where the cleanup service comes in.

The cleaner sent 800+ reminders two days ago. If there is no response by early February on those packages, the waiting for maintainer status will drop significantly as those packages are automatically rejected. Some of those packages have been waiting for maintainer action for over a year and are likely abandoned. If you are a maintainer and you have not been getting emails from the site, you should log in now and make sure your email address is receiving emails and that the messages are not going to your spam folder. A rejected package version is reversible; the moderators can put it back to submitted at any time when a maintainer is ready to work on moving the package towards approval again.

Statistics

This is where it really starts to get exciting.

Some statistics:

  • Around 30 minutes after a package is submitted, the validator runs.
  • Within 1-2 hours, the verifier has finished testing the package and posted the results.
  • Typical human review wait time after a package is deemed good is now less than a day.

We’re starting to build statistics on average time to approval for packages that go through moderation, and these will be visible on the site. Running some statistics by hand: we’ve approved 236 packages created since January 1st, and the average time from final good package (meaning the last time someone submitted fixes to the package) to approval has been 15 hours. A few packages drove that number up while we were fixing some things in our verifier and rerunning the tests. If I look only at packages submitted since those fixes went in on the 10th, that is 104 packages with an average approval time within 7 hours!
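For the record, the hand calculation is nothing fancy. A sketch in JavaScript, with an invented data shape:

```javascript
// Sketch of the by-hand statistics (invented data shape). Each record
// carries the time of the last "good" submission and the approval time.
const HOUR_MS = 60 * 60 * 1000;

function averageApprovalHours(approvedPackages) {
  const totalMs = approvedPackages.reduce(
    (sum, p) => sum + (p.approvedAt - p.lastGoodSubmissionAt),
    0
  );
  return totalMs / approvedPackages.length / HOUR_MS;
}
```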


Migrating from Jekyll to Hexo

WesternDevs has a shiny new look thanks to graphic designer extraordinaire, Karen Chudobiak. When implementing the design, we also decided to switch from Jekyll to Hexo. Besides giving us the opportunity to learn NodeJS, the other main reason was Windows. Most of us use it as our primary machine and Jekyll doesn’t officially support it. There are instructions available from people who were obviously more successful with it than we were. And there are even simpler ones that I discovered during the course of writing this post and that I wish had existed three months ago.

Regardless, here we are, and it’s already been a positive move overall, not least because the move to Node means more of us are available to help with the maintenance of the site. But it wasn’t without its challenges. So I’m going to outline the major ones we faced here, in the hopes that it will help you make a more informed decision than we did.

To preface this, note that I’m new to Node; in fact, this is my first real project with it. That said, I’m no expert in Ruby either, which is what Jekyll is written in. The short version of my first impressions: Jekyll feels more like a real product, but I had an easier time customizing Hexo once I dug into it. Here’s the longer version.

DOCUMENTATION/RESOURCES

You’ll run into this very quickly. Documentation for Hexo is decent but incomplete. And once you start Googling, you’ll discover many of the resources are in Chinese. I quickly found that there is a posts collection and that each post has a categories collection. But as to what these objects look like, I couldn’t tell. They aren’t arrays. And you can’t JSON.stringify them because they have circular references in them. util.inspect works but it’s not available everywhere.
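If you hit the same wall, a small debugging helper goes a long way. Here’s roughly what that looks like; a sketch (the helper name is ours), relying on Hexo picking up anything in the scripts/ folder:

```javascript
// scripts/debug-helper.js
// Registers a `debug` template helper so you can dump Hexo's objects
// with <%- debug(site.posts) %>. JSON.stringify chokes on the circular
// references; util.inspect doesn't.
const util = require('util');

hexo.extend.helper.register('debug', function (obj) {
  return util.inspect(obj, { depth: 1, breakLength: 80 });
});
```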

MULTI-AUTHOR SUPPORT

By default, Hexo doesn’t support multiple authors. Neither does Jekyll, mind you, but we found a pretty complete theme that does. In Hexo, there’s a decent package that gets you partway there. It lets you specify an author ID on a post and it will attach a bunch of information to it. But you can’t, for example, get a full list of authors to list on a Who We Are page. So we created a separate data file for the authors. But we also haven’t figured out how to use that file to generate a .json file for the Featured section on the home page. So at the moment, we have author information in three places. Our temporary solution is to disallow anyone from joining or leaving Western Devs.
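For what it’s worth, the data file approach looks something like the sketch below; the file layout and helper name are ours, and Hexo exposes files under source/_data as site.data:

```javascript
// scripts/authors-helper.js
// source/_data/authors.yml holds entries like:
//   kyle:
//     name: Kyle Baley
//     twitter: kbaley
// Hexo surfaces that file as site.data.authors, so a Who We Are page
// can iterate the full list even though posts only carry an author ID.
hexo.extend.helper.register('allAuthors', function () {
  const authors = this.site.data.authors || {};
  return Object.keys(authors).map((id) =>
    Object.assign({ id }, authors[id])
  );
});
```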

CUSTOMIZATION

If you go with Hexo and choose an existing theme, you won’t run into the same issues we did. Out of the box, it has good support for posts, categories, pagination, and even things like tags and aliases with the right plugins.

But we started from a design and were migrating from an existing site with existing URLs, and we had to make it work. I’ve mentioned the challenge of multiple authors already. Another one: maintaining our URLs. Most of our posts aren’t categorized. In Jekyll, that means they show up at the root of the site. In Hexo, that’s not possible, at least at the moment, and I suspect this is a bug. We eventually had to fork Hexo itself to maintain our existing URLs.

Another challenge: excerpts. In Jekyll, excerpts work like this: check the front matter for an excerpt. If one doesn’t exist, take the first few characters from the post. In Hexo, excerpts are empty by default. If you add a <!--more--> tag in your post, everything before it is considered the excerpt. And if you specify an excerpt in your front matter, it’s ignored because there is already an excerpt property on your posts.

Luckily, there’s a plugin to address the last point. But it still didn’t address the issue of all our posts without an excerpt, where we relied solely on the contents of the post.
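The Jekyll-style fallback we wanted can be approximated with a small filter. A sketch only; given the documentation situation described earlier, treat the property names as approximate:

```javascript
// scripts/excerpt-fallback.js
// Jekyll-style excerpts: if a post has no excerpt after rendering,
// fall back to the first 200 characters of its content.
hexo.extend.filter.register('after_post_render', function (data) {
  if (!data.excerpt || data.excerpt.trim() === '') {
    data.excerpt = (data.content || '')
      .replace(/<[^>]*>/g, '') // crude tag strip for a plain-text preview
      .slice(0, 200);
  }
  return data;
});
```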

So if you’re looking to veer from the scripted path, be prepared. More on this later in the “good parts” section.

OVERALL FEELING OF RAWNESS

This is more a culmination of the previous issues. It just feels like Hexo is a work-in-progress whereas Jekyll feels like a finished product. There’s a strong community behind Jekyll and plenty of help. Hexo still has bugs that suggest it’s just not used much in the wild, like rendering headers with links in them. It makes the learning process a bit challenging because with Jekyll, if something didn’t work, I’d think, “I’m obviously doing something wrong.” With Hexo, it’s “I might be doing something wrong, or there might be a bug.”


THE GOOD PARTS

I said earlier that the move to Hexo was positive overall and not just because I’m optimistic by nature. There are two key benefits we’ve gained just in the last two weeks.

GENERATION TIME

Hexo is fast, plain and simple. Our old Jekyll site took six seconds to generate. Doesn’t sound like much but when you’re working on a feature or tweaking a post, then saving, then refreshing, then rinsing, then repeating, that six seconds adds up fast. In Hexo, a full site generation takes three seconds. But more importantly, it is smart enough to do incremental updates while you’re working on it. So if you run hexo server, then see a mistake in your post, you can save it, and the change will be reflected almost instantly. In fact, it’s usually done by the time I’ve switched back to the browser.

CONTRIBUTORS

We had logistical challenges with Jekyll, to the point where we had two methods for Windows users who wanted to contribute (i.e. add a post). One involved a Docker image and the other Azure ARM. Neither was ideal, as they took anywhere from seconds to minutes to refresh when you made changes. Granted, both methods furthered our collective knowledge of Docker and Azure, but they both kinda sucked for productivity.

That meant that realistically, only the Mac users really contributed to the maintenance of the site. And our Docker/Azure ARM processes were largely ignored as we would generally just test in production. I.e. create a post, check it in, wait for the site to deploy, make necessary changes, etc, etc.

With the switch to Hexo, we’ve had no fewer than five contributors to the site’s maintenance already. Hexo just works on Windows. And on Mac. Best of both worlds.

CUSTOMIZATION

This is listed under the challenges but, ever the optimist, I’m including it here as well. We’ve had to make some customizations for our site, including forking Hexo itself. And for me personally, once I got past the “why isn’t this working the way I want?” stage, it’s been a ton of fun. It’s crazy simple to muck around in the node modules to try stuff out. And just as simple to fork something and reference it in your project when the need arises. I mentioned the earlier issue of rendering links in headers. No problem: we just swapped out the markdown renderer for another one. And if that doesn’t work, we’ll tweak something until it does.


I want to talk more about the specific conversion issues we ran into as a guide for those following in our footsteps. But there are enough of them to warrant a follow-up post without all this preamble. For now, we’re all feeling the love for Hexo. So much so that no fewer than three other Western Devs are in the process of converting their personal blogs to it.

Originally posted to: http://www.westerndevs.com/jekyll/hexo/Migrating-from-Jekyll-to-Hexo/

Chocolatey Community Feed State of the Union

tl;dr: Everything on https://chocolatey.org/notice is coming to fruition! We’ve automatically tested over 6,500 packages, a validator service is coming up now to check quality and the unreviewed backlog has been reduced by 1,000 packages! We sincerely hope that the current maintainers who have been waiting weeks and months to get something reviewed can be understanding that we’ve dug ourselves into a moderation mess and are currently finding our way out of this situation.

Notice on Chocolatey.org
We’ve added a few things to Chocolatey.org (the community feed) to help speed up review times for package maintainers. A little over a year ago we introduced moderation for all new package versions (besides trusted packages) and, from the user perspective, it has been a fantastic addition. Usage has gone up by over 20 million packages installed in one year versus just 5 million in the 3 years before it! It’s been an overwhelming response from the user community. Let me say that again for effect: Chocolatey’s usage of community packages has increased 400% in one year over the prior three years combined!

But let’s be honest, we’ve nearly failed in another area: keeping the moderation backlog low. We introduced moderation as a security measure for Chocolatey’s community feed because it was necessary, but we introduced it too early. We didn’t have the infrastructure automation in place to handle the sheer load of packages that were suddenly thrown at us. And once we put moderation in place, more folks wanted to use Chocolatey, so it suddenly became much more popular. And because we have automation surrounding updating and pushing packages (namely automatic packages), we had some folks who would submit 50+ packages at a time. With one particular maintainer submitting 200 packages automatically, and a review of each of them taking somewhere between 2 and 10 minutes, you don’t have to be a detective to understand how this was going to become a source of consternation. And from the backlog you can see it really hasn’t worked out well.

1597 submitted

The most important number to understand here is the number in the submitted status (underlined). This is the number of packages where a moderator has not yet looked at a package. A goal is to keep this well under 100. We want the time from a high quality package being submitted to approved to be within 1-2 days.

Moderation has, up until recently, been a very manual process. Sometimes, which moderator looked at your package determined whether it was going to be held in review for various reasons. We’ve added moderators and we’ve added more guidance around moderation to help bring a more structured review process. But it’s not enough.

Some of you may not know this, but our moderators are volunteers and we currently lack full-time employees to help fix many of the underlying issues. Even considering that we’ve also needed to work towards Kickstarter delivery and the Chocolatey rewrite (making choco better for the long term), it’s still not the greatest news to hear that it has taken a long time to fix moderation, but hopefully it brings some understanding. Our goal is to eventually bring on full-time employees but we are not there yet. The Kickstarter was a start, but it was just that: a kick start. A few members of the core team who are also moderators have focused on ensuring the Kickstarter turns into a model that can ensure the longevity of Chocolatey. It may have felt like we were ignoring the needs of the community, but that has not been our intention at all. It’s just been really busy and we needed to address multiple areas surrounding Chocolatey with a small number of volunteers.

So What Have We Fixed?

All moderation review communication is done on the package page. Now all review is done on the website, which means there is no more email back and forth (the older process) leaving what looked like one-sided communication on the site. This is a significant improvement.

Package review logging. Now you can see right from the discussion who submitted a package and when, when statuses change, and where the conversation stands.

package review logging

More moderators. A question that comes up quite a bit is how many moderators we have and whether we are adding more. We have added more moderators; we are up to 12 for the site. Moderators are chosen based on trust, usually built by being extremely familiar with Chocolatey packaging and what is expected of approved packages. Learning what is expected usually comes through having a few of your own packages approved. We’ve written most of this up at https://github.com/chocolatey/choco/wiki/Moderation.

Maintainers can self-reject packages that no longer apply. Say your package has a download URL for the software that is always the same. Some of your older package versions are no longer applicable and could take advantage of being purged out of the queue.

The package validation service (the validator). The validator checks the quality of a package based on requirements, guidelines and suggestions for creating packages for Chocolatey’s community feed. Many of the validation items will automatically roll back into choco and will be displayed when packaging a package. We like to think of the validator as unit testing. It is validating that everything is as it should be and meets the minimum requirements for a package on the community feed.

validation results

The package verifier service (the verifier). The verifier checks the correctness (that the package actually works), that it installs and uninstalls correctly, has the right dependencies to ensure it is installed properly and can be installed silently. The verifier runs against both submitted packages and existing packages (checking every two weeks that a package can still install and sending notice when it fails). We like to think of the verifier as integration testing. It’s testing all the parts and ensuring everything is good. On the site, you can see the current status of a package based on a little colored ball next to the title. If the ball is green or red, the ball is a link to the results (only on the package page, not in the list screen).

passed verification - green colored ball with link

  • Green means it passed verification; the ball is a link to the results.
  • Orange means verification is still pending (it has not yet run).
  • Red means it failed verification for some reason; the ball is a link to the results.
  • Grey means unknown, or excluded from verification (if excluded, a reason will be listed on the package page).

Coming Soon – Moderators will automatically be assigned to backlog items. Once a package passes both validation and verification, a moderator is automatically assigned to review the package. Once the backlog is in a manageable state, this will be added.

What About Maintainer Drift?

Many maintainers come in to help out at different times in their lives, and they do it nearly always as volunteers. Sometimes it’s about the tools they are using at the time, and sometimes it has to do with where they work. Over time, folks’ preferences and workplaces change, and so maintainers drift away from keeping packages up to date because they have no internal incentive to continue maintaining those packages. It’s a natural human response. I’ve been thinking about ways to reduce maintainer drift for the last three years, and I keep coming back to the idea that consumers of those packages could come along and provide a one-time or weekly tip to the maintainer(s) as a thank you for keeping the package(s) updated. We are talking to Gratipay now: https://github.com/gratipay/inside.gratipay.com/issues/441. This, in addition to a reputation system, will, I feel, go a long way towards reducing maintainer drift.

Final Thoughts

Package moderation review time is down to mere seconds as opposed to minutes like before. This will allow a moderator to review and approve package versions much more quickly and will reduce our backlog and keep it lower.

It’s already working! The number in the unreviewed backlog is down by 1,000 from the month prior. This is because a moderator no longer has to wait for a proper time when they can have a machine up, ready for testing, and in the right state. Now packages can be reviewed faster. And this is only with the verifier in place, merely testing package installs; the validator is expected to cut review time down to near seconds. The total number of packages in the moderation backlog has also been reduced, but honestly I usually only pay attention to the unreviewed backlog number, as it is the most important metric for me.

The verifier has rolled through over 6,500 verifications to date! https://gist.github.com/choco-bot/

When chocobot hit 6500 packages verified

We sincerely hope that the current maintainers who have been waiting weeks and months to get something reviewed can be understanding that we’ve dug ourselves into a moderation mess and are currently finding our way out of this situation. We may have some required findings and will ask for those things to be fixed, but for anything that doesn’t have required findings, we will approve them as we get to them.


Testing with Data

It’s not a coincidence that this is coming off the heels of Dave Paquette’s post on GenFu and Simon Timms’ post on source control for databases in the same way it was probably not a coincidence that Hollywood released three body-swapping movies in the 1987-1988 period (four if you include Big).

I was asked recently for some advice on generating data for use with integration and UI tests. I already have some ideas but asked the rest of the Western Devs for some elucidation. My tl;dr version is the same as what I mentioned in our discussion on UI testing: it’s hard. But manageable. Probably.

The solution needs to balance a few factors:

  • Each test must start from a predictable state
  • Creating that predictable state should be as fast as possible
  • Developers should be able to figure out what is going on by reading the test

The two options we discussed both assume the first factor to be immutable. That means you either clean up after yourself when the test is finished, or you wipe out the database and start from scratch with each test. Cleaning up after yourself might be faster but has more moving parts; if the test fails, cleaning up might mean different things depending on which step you were in.

So given that we will likely re-create the database from scratch before each and every test, there are two options. My current favourite solution is a hybrid of the two.

Maintain a database of known data

In this option, you have a pre-configured database. Maybe it’s a SQL Server .bak file that you restore before each test. Maybe it’s a GenerateDatabase method that you execute. I’ve done the latter on a Google App Engine project, and it works reasonably well from an implementation perspective. We had a class for each domain aggregate and used dependency injection. So adding a new test customer to accommodate a new scenario was fairly simple. There are a number of other ways you can do it, some of which Simon touched on in his post.

We also had it set up so that we could create only the customer we needed for a particular test. That way, we could use a step like Given I'm logged into 'Christmas Town' and it would set up only that data.
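In spirit, the setup looked something like the sketch below. The real project was on Google App Engine with dependency injection; the JavaScript and every name in it are invented for illustration:

```javascript
// Sketch: one builder per domain aggregate so each test creates only
// the data it needs. All names invented for illustration.
class CustomerBuilder {
  constructor(db) {
    this.db = db;
  }

  // A known test customer used by many scenarios.
  async christmasTown() {
    return this.db.insert('customers', {
      name: 'Christmas Town',
      owner: 'Jack Skellington',
    });
  }
}

// "Given I'm logged into 'Christmas Town'" boils down to:
async function givenLoggedInto(db, session) {
  const customer = await new CustomerBuilder(db).christmasTown();
  await session.logInAs(customer); // hypothetical auth helper
  return customer;
}
```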

There are some drawbacks to this approach. You still need to create a new class for a new customer if you need to do something out of the ordinary. And if you need to do something only slightly out of the ordinary, there’s a strong tendency to use an existing customer and tweak its data ever so slightly to fit your test’s needs, other tests be damned. With these tests falling firmly in the long-running category, you don’t always find out the effects of this until much later.

Another drawback: it’s not obvious from the test exactly what data you need for that specific test. You can accommodate this somewhat with a naming convention. For example, Given I'm logged into a company from India, if you’re testing how the app works with rupees. But that’s not always practical. Which leads us to the second option.

Create an API to set up the data the way you want

Here, your API contains steps to fully configure your database exactly the way you want. For example:

Given I have a company named "Christmas Town" owned by "Jack Skellington"
And I have 5 product categories
And I have 30 products
And I have a customer
...


You can probably see the major drawback already: this can become very verbose. On the other hand, you have the advantage of seeing exactly what data is included, which is helpful when debugging. If your test data is wrong, you don’t need to go mucking about in your source code to fix it. Just update the test and you’re done.
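To make the idea concrete, here’s how the first couple of steps might bind up in cucumber-js (one option among many; testDb and its methods are hypothetical helpers that write directly to the test database):

```javascript
// Sketch of step definitions for the verbose API, using cucumber-js.
// `testDb` and its methods are hypothetical helpers.
const { Given } = require('cucumber');
const testDb = require('./support/test-db'); // hypothetical module

Given('I have a company named {string} owned by {string}',
  async function (companyName, ownerName) {
    this.company = await testDb.createCompany({
      name: companyName,
      owner: ownerName,
    });
  });

Given('I have {int} product categories', async function (count) {
  await testDb.createProductCategories(this.company, count);
});
```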

Also note the lack of specifics in the steps. Whenever possible, I like to be very vague when setting up my test data. If you have a good framework for generating test data, this isn’t hard to do. And it helps uncover issues you may not account for using hard-coded data (as anyone named D’Arcy O’Toole can probably tell you).


Loading up your data with a granular API isn’t realistic, which is why I like the hybrid solution. By default, you pre-load your database with some common data, like lookup tables with lists of countries, currencies, product categories, etc. Stuff that needs to be in place for the majority of your tests.

After that, your API doesn’t need to be that granular. You can use something like Given I have a basic company, which will create the company, add an owner, and maybe some products, and use that to test the process for creating an order. Under the hood, it will probably use the specific steps, as sketched below.
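Continuing the earlier sketch (same hypothetical helpers), the coarse step might look like this:

```javascript
// Sketch: the coarse step delegates to the granular helpers and fills
// in defaults the test doesn't care about.
const { Given } = require('cucumber');
const testDb = require('./support/test-db'); // hypothetical module

Given('I have a basic company', async function () {
  this.company = await testDb.createCompany({
    name: 'Acme Inc.', // any name; tests that care will change it
    owner: 'Some Owner',
  });
  await testDb.createProductCategories(this.company, 3);
  await testDb.createProducts(this.company, 10);
});
```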

One reason I like this approach: it hides only the details you don’t care about. When you say Given I have a basic company and I change the name to "Rick's Place", that tells me, “I don’t care how the company is set up but the company name is important”. Very useful for narrowing the focus of the test when you’re reading it.

This approach will understandably lead to a whole bunch of different methods for creating data of various sizes and coarseness. And for that you’ll need to…

Maintain test data

Regardless of your method, maintaining your test data will require constant vigilance. In my experience, there is a tremendous urge to take shortcuts when it comes to test data. You’ll re-use a test company that doesn’t quite fit your scenario. You’ll alter your test to fit the data rather than the other way around. You’ll duplicate a data setup step because your API isn’t discoverable.

Make no mistake, maintaining test data is work. It should be treated with the same respect and care as the rest of your code. Possibly more so, since the underlying code (in whatever form it takes) technically won’t be tested. Shortcuts and bad practices should not be tolerated or let go because “it’s just test data”. Fight the urge to let things slide. Call it out as soon as you see it. Refactor mercilessly once you see opportunities to do so.

Don’t be afraid to flip over a table or two to get your point across.

– Kyle the Unmaintainable
