Naked code and naked data - Nuno Alexandre

Naked code and naked data are a somewhat discreet code smell that spreads quickly and decreases the quality of a project. Let’s see why, and how we can avoid them through wrapped code.

Naked data

Naked data is data that is represented and passed around using primitive types. Its domain is implicit or even hidden. This leads to three problems:

Poor type vocabulary
Low type safety
Bugs

Let’s jump to an example. Suppose you are given the following function:

runJob :: Int -> Int -> String -> String -> IO ()
runJob projectId companyId jobType priority = pure () -- do something

Unless you check the implementation of this function, you don’t know in which order to pass the arguments. Furthermore, projectId and companyId are both just Ints, meaning that readers and the compiler see them as the same thing.

This program has a very poor vocabulary and opens up a whole world of errors, such as:

let projectId = 1
let companyId = 2
let jobType = "download"
let priority = "normal"

-- wrong order, but it still compiles
runJob companyId projectId priority jobType

let otherCompanyId = 1

-- compiles and evaluates to True, even though it compares a company with a project
otherCompanyId == projectId

You might be thinking: “You can cover these cases through exhaustive testing”. That might help a little, but it’s way better to make it right by design.

The example above could be made right by design by wrapping each piece of data in its own domain type:

newtype ProjectId = ProjectId Int
newtype CompanyId = CompanyId Int
data JobType = Download | Upload | Save
data Priority = Low | Normal | High

runJob :: ProjectId -> CompanyId -> JobType -> Priority -> IO ()

This way we have both a rich vocabulary and type safety, which avoids all the mistakes that were possible before. This sure adds a tiny overhead, but one thing I have been learning with my Team Lead Robert Kreuzer is that life is made of tradeoffs. And here the pros are much stronger than the cons.
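To make the gain concrete, here is a minimal, self-contained sketch of the wrapped version. The `deriving` clauses and the `describeJob` helper are my own assumptions for illustration, since the original body of `runJob` is just "do something".

```haskell
-- Wrapped identifiers: represented as an Int at runtime, but distinct types.
newtype ProjectId = ProjectId Int deriving (Eq, Show)
newtype CompanyId = CompanyId Int deriving (Eq, Show)

data JobType = Download | Upload | Save deriving (Eq, Show)
data Priority = Low | Normal | High deriving (Eq, Ord, Show)

-- Hypothetical helper: stands in for the real work so there is something to observe.
describeJob :: ProjectId -> CompanyId -> JobType -> Priority -> String
describeJob (ProjectId p) (CompanyId c) jobType priority =
  show jobType ++ " job for project " ++ show p
    ++ " of company " ++ show c
    ++ " at " ++ show priority ++ " priority"

runJob :: ProjectId -> CompanyId -> JobType -> Priority -> IO ()
runJob projectId companyId jobType priority =
  putStrLn (describeJob projectId companyId jobType priority)

main :: IO ()
main = do
  let projectId = ProjectId 1
      companyId = CompanyId 2
  runJob projectId companyId Download Normal
  -- Both lines below are now rejected by the type checker:
  -- runJob companyId projectId Normal Download
  -- print (companyId == projectId)
```

Swapping the arguments or comparing a company with a project is no longer something a test has to catch: the program simply does not compile.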

Naked code

The concept of naked code is similar to the concept of naked data, but I’d say it’s slightly more subjective.

Naked code is code written and implemented without a domain. It has the potential to be something on its own, but, unfortunately, it’s not.

I have three main problems with naked code:

Hard to test
Hard to read
Impossible to reuse

An example of naked code would be:

-- | Active companies with failed jobs should be alerted.
getCompaniesToAlert :: [CompanyId] -> [Job] -> [CompanyId]
getCompaniesToAlert activeCompanies allJobs =
  intersect activeCompanies (nub . map jobCompanyId . filter ((== Failed) . jobStatus) $ allJobs)

Good luck reading this and, even worse, reading a system where code like this is the norm. For someone reading it, it’s unclear what this function is really about. This kind of code reminds me of people who keep opening parentheses while telling a story, until neither they nor you can tell where it is going.

Instead, give every piece of knowledge and behaviour its own place and let composition be declarative and objective:

getCompaniesToAlert :: [CompanyId] -> [Job] -> [CompanyId]
getCompaniesToAlert subscribedCompanies allJobs =
  intersect subscribedCompanies (getCompaniesWithFailingJobs allJobs)

getFailedJobs :: [Job] -> [Job]
getFailedJobs = filter ((== Failed) . jobStatus)

getCompaniesWithFailingJobs :: [Job] -> [CompanyId]
getCompaniesWithFailingJobs = nub . map jobCompanyId . getFailedJobs

Again, this adds a small overhead, but the pros outweigh the cons, since the code is now:

Testable
Declarative
Readable
Reusable
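As a rough illustration of the "testable" and "reusable" points, the sketch below exercises each piece on its own. The `Job` record, the `JobStatus` type and `sampleJobs` are hypothetical stand-ins, since this post never shows how jobs are actually defined.

```haskell
import Data.List (intersect, nub)

newtype CompanyId = CompanyId Int deriving (Eq, Show)

-- Hypothetical definitions, assumed only for this example.
data JobStatus = Succeeded | Failed deriving (Eq, Show)

data Job = Job
  { jobCompanyId :: CompanyId
  , jobStatus    :: JobStatus
  } deriving (Eq, Show)

getFailedJobs :: [Job] -> [Job]
getFailedJobs = filter ((== Failed) . jobStatus)

getCompaniesWithFailingJobs :: [Job] -> [CompanyId]
getCompaniesWithFailingJobs = nub . map jobCompanyId . getFailedJobs

getCompaniesToAlert :: [CompanyId] -> [Job] -> [CompanyId]
getCompaniesToAlert subscribedCompanies allJobs =
  intersect subscribedCompanies (getCompaniesWithFailingJobs allJobs)

-- Made-up data: company 1 fails twice, company 2 succeeds, company 3 fails once.
sampleJobs :: [Job]
sampleJobs =
  [ Job (CompanyId 1) Failed
  , Job (CompanyId 1) Failed
  , Job (CompanyId 2) Succeeded
  , Job (CompanyId 3) Failed
  ]

main :: IO ()
main = do
  -- Each small function can be checked in isolation.
  print (length (getFailedJobs sampleJobs) == 3)
  print (getCompaniesWithFailingJobs sampleJobs == [CompanyId 1, CompanyId 3])
  print (getCompaniesToAlert [CompanyId 1, CompanyId 2] sampleJobs == [CompanyId 1])
```

Each check targets one named piece of behaviour, so a failure points straight at the function that is wrong, and `getFailedJobs` can be reused anywhere else failed jobs are needed.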

Salt & Olive oil

This approach must be taken with a pinch of salt (and olive oil). Weigh the trade-off you are making in your specific case and choose accordingly. Sometimes breaking everything out into its own function is not the best route, and it’s important to stay balanced.