Tuesday, December 15, 2020

Unpacking the Allied Security Operations Group Forensic Report

 Alright, I can't help myself.  I hate disinformation.  I also hate ignorance.  More than anything, I'm not a fan of people spreading information they wish were true without doing even basic diligence to verify it.  So, of course, I can't stand anything that is being said about this election on either side of the issues.

I've been challenging many people lately to provide some actual evidence of election fraud from an actual firsthand source, and I finally found something.  This is a forensic report performed on some of the voting hardware used in Michigan.  Below is my analysis of the report.

Summary

For those who don't want to unpack this whole thing, here is my high-level takeaway.
  1. The company that produced the report would never get my business.  The report itself is poorly written, makes statements and assumptions that are inappropriate in a forensic analysis, and repeats many demonstrably false claims that are floating around the internet.  They should be embarrassed.
  2. The report identifies a few things that I find extremely concerning.  It is enough that I would wholeheartedly support the FBI, or another government agency, obtaining forensic images from across Michigan to perform a much deeper analysis.  I think there could be reasonable answers to much of this, but there could also be very troubling answers as well.
  3. The report also reveals some things that I consider very sad, but totally expected and not actually nearly as concerning as they might first appear.
Long story short, there is evidence that suggests someone may have successfully hidden their tracks.  What they did is unclear.  However, it is crucial to note that the sort of fraud that *might* have happened in Antrim County, Michigan does not worry me.  And here is why.  It would require perpetrating such fraud at the precinct level.  This means that even flipping a single state would require hundreds of well-trained operatives across hundreds of precincts.  Each would have had to successfully execute their role without being caught.  In each case, they would have needed access to log in to the machines in question.  And they would have had to do so in a way that avoided a re-count, as a hand re-count, or even a machine-based one, would uncover and unravel their efforts.  The logistical effort required is simply so immense that anyone capable of doing this would have been a fool.  There are just far easier ways to manipulate an election.

Holy Guacamole Batman

OK, so first with the damning items.

Mismatched Vote Counts

Items B.3 and D.8-10 outline situations in which the votes in Central Lake Township were tabulated twice (or three times; it's not totally clear).  In short, the different counting runs came up with very different results.  As I understand it, this has been explained by the state as being the result of a configuration problem with the machines.  That explanation is all fine and well, but it should be verified and validated through independent analysis.  As ASOG explains, it is not at all clear that this was the result of a mis-configuration.  It certainly could have been, but I think we deserve to understand that in much more detail than has been provided.  ASOG's analysts clearly could not find a "configuration" change that explained what they saw in the results.  I don't have nearly the information they do, but from what I see in their report, I can't detect any obvious pattern, such as all Republican votes going to Democrats and vice versa.

Configuration Changes

Speaking of mis-configuration: ASOG goes into extensive detail about the requirement that voting machines and configurations be certified and "frozen" 90 days prior to the election.  I know nothing about these rules, so I'll take their word for it.  As such, even the changes that the state has admitted were made are potentially problematic.  Perhaps ASOG's interpretation of the law is lacking?  Even if that is the case, asking for a clear explanation of exactly what was wrong with the configuration and how it impacted things is not at all unreasonable in this situation.

Moreover, as ASOG makes clear, there are testing protocols in place.  I completely agree that it is very odd that configuration changes would have been made that were not thoroughly tested.  If I understand the statement from the Secretary of State correctly, they are claiming that the configuration problem only impacted the "unofficial" counts that were shared with the press; the official counts were never misconfigured in the first place.  Here, again, if that's the case we simply deserve a much deeper explanation so that any concerns to the contrary can be put to rest.  This could, potentially, explain why tests were successful, if the testing was only concerned with the official vote counts.

Missing Audit and Security Logs

Alright, here is the big one.  Two critical types of logs were missing from the systems.  First, the adjudication logs, which I believe show changes made by election workers to ballot counts, and second, the system security logs, which would show who logged in and deleted the adjudication logs.  This is a really big deal.  It opens up a very real possibility that someone "flipped" votes and then covered their tracks.  The adjudication logs for prior years are still present, which makes the missing 2020 logs all the more suspicious.  Similarly, the security logs are missing for only a few days.

The purpose of the adjudication function is to allow election workers to fix ballots.  For example, a voter might mark the wrong candidate, cross that out, and then mark the candidate they intended to vote for.  In that situation the system would flag the ballot and an adjudicator would "correct" it.  While this feature has been decried as an intentional hole built to invite fraud, it is actually a critically important feature of any viable election system.

There are some major concerns posed online about how Dominion handles adjudication.  I really can't comment, as I have no experience with their systems.  If I were building such a function, I'd require at least two people to sign in first to ensure at least some oversight.  I would then ensure a clear log of all adjustments made and who made them.  This would all go into a tightly secured area of the system to ensure any fraud could easily be traced back to the perpetrator.  I have no idea how much of that Dominion does, but given that whatever adjudication logs they have are missing in this case, it really doesn't matter.  This means we don't know how many ballots went to adjudication nor what happened to them once they got there.  One piece of good news: a hand recount of the ~1,500 ballots in Central Lake Township would easily answer the most important question here: was there fraud?  If the official vote counts match the re-count, then we're left with some really suspicious behavior that had no real impact.
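
To make that concrete, here is a minimal sketch of a tamper-evident audit trail (my own invention for illustration, not anything Dominion is known to do).  Each entry records who did what and embeds a hash of the previous entry, so a deleted or edited line breaks the chain:

# Hypothetical append-only adjudication log (illustration only).
# Each entry chains to the previous one via its hash.
prev=$(tail -n 1 audit.log 2>/dev/null | sha256sum | cut -d' ' -f1)
echo "$(date -u +%FT%TZ) adjudicator=jdoe witness=rroe ballot=0042 action=corrected prev=$prev" >> audit.log
# To verify: recompute each line's hash and compare it to the next line's prev= field.
# A real system would also sign entries and replicate the log off the machine.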

Ballot Reversals

Directly from the report:

For examples, there were 1,222 ballots reversed out of 1,491 total ballots cast,
thus resulting in an 81.96% rejection rate. Some of which were reversed due to
"Ballot's size exceeds maximum expected ballot size".

I'm not sure what to say here.  This is strange, but without more information it's hard to piece together.  The report doesn't mention the election workers saying they had to adjudicate almost all of their ballots.  I would think this would have made more headlines and caused more immediate concern about the vote totals, but who knows.  Long story short, it's very concerning if it is an accurate statement; however, I suspect this is a case where ASOG may have made some bad assumptions about what they were seeing in the log files.  Nonetheless, simply deposing the election workers and asking them would help get to the bottom of this very quickly.

Error Rates

A review of the tabulation log found 10,677 distinct error entries.  Now, don't get me wrong, errors themselves are not unusual.  Your bank's software probably throws hundreds of error messages a day.  Very few companies can boast of a 0% error rate, and most of those that do are really only showing off how poor their error monitoring is.  Errors are part of software.  However, this is election software.  Next to software that can determine whether someone lives or dies (think life support systems on aircraft), there are few applications that should be held to a higher standard than election systems.  This is a LOT of errors for a system that processed only ~1,500 ballots.  I'll talk later in this post about why I disagree with ASOG's analysis of these errors, but I do agree with the overall premise that this is a very strange result.  Perhaps it's related to the configuration issue that was later corrected?  If so, again, we deserve that in-depth explanation.

Conclusion

I join with ASOG on this one when they state:
 
We recommend that an independent group should be empaneled to determine
the extent of the adjudication errors throughout the State of Michigan. This is a
national security issue.

What I see here is enough to warrant investigation.  Plain and simple.  I think there is a very real chance that there is nothing much going on, but elections are pretty sacred in my book.  So, yes, we should get to the bottom of these claims, and we should do so quickly.

Scary Like the Monster Under Your Bed

Adjudication

As noted earlier, the majority of ASOG's concern with the Dominion system is their belief that it produced a large number of adjudicated ballots, and that these are far more susceptible to fraud.  They even share a video showing how easy it is to use adjudication to commit fraud.  They're right, sort of.  I don't worry about adjudication fraud for two simple reasons.  First, it must happen at the precinct level.  To influence even a close state like Michigan, you would need to perpetrate this type of fraud broadly across hundreds or thousands of precincts.  In each one, you would need to find a willing participant.  That conspirator would have to avoid detection by the other workers in the precinct, including anyone who might provide oversight to the adjudication.  They'd have to keep their total impact small enough to avoid a recount due to suspicion, but large enough to actually impact the results.  Second, they'd have to do all of this knowing that a recount could expose them.  Finally, it would all have to be done in a one-sided manner: lots of fraud for one party, and very little for the other.  So, yes, adjudication is an opportunity for fraud, but is it likely to impact an election?  No.

Basic Cyber Security

The report includes a number of other findings that are probably very concerning to those unfamiliar with cybersecurity practice.  And, honestly, my elaborating might make you even less comfortable with the safety and security of not only our elections, but also your everyday life.  But, here goes anyway.

The report mentions several major flaws in the security of the Dominion systems, including hard drives without encryption, shared administrator credentials, unsecured database files, overdue security updates, etc.  This is an all-too-common reality.  I'm not sure who maintains the voting machines, but if it's local or even state government, then I'm not surprised.  Few non-federal government agencies can afford high-quality cybersecurity and systems administration talent.  States can and should, but for cities and counties that is a hard budget item.  So, it is likely that your local DMV, your city utilities, your local police department, and so forth all have similar security failures.  This excuses nothing.  It's BAD.  It is really BAD.  I only point it out so you understand why I'm not surprised by it.

Am I concerned by it?  Yes and no.  This goes back to the issue with adjudication.  These machines are air-gapped (though ASOG found they could be connected to the internet, they are not supposed to be).  So taking advantage of any of these issues would require an advanced threat - a government-sponsored level of effort.  Could China or Russia use issues like these to impact our elections?  Yes, they could.  But they have far easier ways to influence things.  If you doubt they influence our elections already, you're living under a rock.  They don't even need to change the outcome; all they need to do is make us question it.

A Lesson in Credibility

OK.  So far we've given ASOG a free pass by assuming they are honest, know what they are doing, and know what they are talking about.  If you are a Trump supporter who wants to believe he can still win, you might want to stop reading at this point, because you probably won't believe half of what I have to say anyway.

There is a LOT in this report to scoff at, from unprofessional delivery to outright false statements.  It all calls ASOG's validity and trustworthiness into serious question.  It's sad, because 90% of this has little to do with the analysis they were hired to perform.  They could have left these things out, made a strong case, and come out looking very professional.  But they did not.

Credentials, or the lack thereof

Any good expert testimony starts with a declaration of the expert's "expertness".  This one, not so much.  Russell Ramsland, a member of the management team, starts off by stating his personal credentials: a Harvard MBA, a political science degree, and some name-dropping of who he's worked with/for (NASA and MIT).  Note that nowhere does he mention cyber analytics or election systems.  He goes on to paint a pretty picture of his team of experts, but fails to name any of them or state any of their credentials.  They are simply a "group of globally engaged professionals who come from various disciplines to include Department of Defense, Secret Service, Department of Homeland Security, and the Central Intelligence Agency."  In short, "we're really cool experts, but you'll just have to trust us."

Despite my best efforts, I can't find anything about who they actually are.  No LinkedIn profiles, no "about us" on their webpage.  Nothing.  Their web domain was purchased on 1/30/2020, so they have at least existed for longer than a few months.  It is registered to Allied Special Operations Group (a slightly different name), which has its own website, alliedspecialops.us, purchased in 2017.  Long story short, I don't think they were created for the sake of this analysis.  It is not unusual for cybersecurity consulting companies to be very secretive about their employees, so we can give them some degree of a pass here.

Professionalism

When you write a professional analysis, it is critical to focus on facts.  You avoid value judgments, and you especially avoid assigning intentions, as these are highly subjective and opinion-based.  In some cases it might be appropriate to make statements about intent, but these are almost always qualified.  For example, "this could indicate malicious intent" rather than "this is a clear indication of malicious intent".  ASOG clearly doesn't think this is important.  I'll quote a few parts of their report with my commentary.

"We conclude that the Dominion Voting System is intentionally and purposefully
designed with inherent errors to create systemic fraud and influence election
results."

and

"The systemic errors are intentionally designed to create errors in order to push a high volume of ballots to bulk adjudication." 

Not only do they make the value statement that the errors are intentional (how can you know that, and where is the evidence?), but they go so far as to state the reason for the intentional errors as fact: this was done to "create systemic fraud and influence election results".  They offer no evidence or reasoning for their conclusion.  So, basically, we are to believe that a company that builds election software intentionally builds it poorly so that it can be used to do exactly what it purports to prevent?


"A purposeful lack of providing basic computer security updates in the system software..."

In this case, they are claiming that whoever is responsible for maintaining the Dominion equipment (likely a government employee or an employee of a government contractor) was intentionally avoiding good security.  They don't even indicate who would be responsible for this, but they have no problem accusing that person(s) of intentionally poor security explicitly to invite fraud.  This is quite the deep conspiracy now.

Dubious Claims

Some of their conclusions are just strange to me.  In many cases, they appear to be crafted to make good headlines, rather than to accurately represent facts.  (see how I used that qualifier :) )

For example, they talk about a 68% error rate.  The implication seems to be that the system messed up on 68% of ballots.  However, a careful reading shows something quite different.  When inspecting the "tabulation log", they noted that 68% of log file entries were error messages.  This tells us nothing about error frequency per ballot.  It simply tells us that the system logs errors more often than it logs other things.  Log messages do not map one-to-one to ballots (~1,500 ballots resulted in more than 15,000 log messages).  I can't say more without seeing the actual logs, but it's possible that all of those error messages occurred during the system's startup, before any ballots were even processed.  We simply don't know, and ASOG doesn't give us enough information to draw our own conclusions.
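
To illustrate the distinction (with a hypothetical tabulation.log, since we don't have the real one):

# These numbers are illustrative only -- we don't have ASOG's actual log.
total=$(wc -l < tabulation.log)
errors=$(grep -ci 'error' tabulation.log)
echo "error share of log lines: $errors of $total"
# A high share here says nothing about ballots: a chatty startup sequence
# alone could emit thousands of error lines before the first ballot is scanned.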

Next, in J.8, ASOG claims that a user attempted to "zero out election results" and provides the following error message as "direct proof of an attempt to tamper with evidence."
Id:3168 EmsLogger - There is no permission to {0}- Project: User: Thread: 189. 
Now, I've done a lot of troubleshooting in my day.  I've read a lot of error messages.  I can tell you that I have NO IDEA what this error message is saying.  It reminds me of what happens when the error message itself has an error.  It appears to be trying to replace the "{0}" placeholder with the name of the missing permission, and failing.  How ASOG interprets this message as an attempt to zero out anything is beyond me.  Without direct access to the Dominion source code, I don't see any way they could properly interpret that error.

In B.20, ASOG claims that the ICP machines have the ability to connect to the internet.  This is certainly problematic, as they state.  Oddly, they go on to state that they connected a network scanner to the ethernet port of the machine and recorded traffic, but then immediately state that they do not know the origin or destination of the connection.  This is odd for two reasons.  First, in a proper forensic analysis, you don't connect to, boot, or utilize the original system hardware; doing so corrupts the integrity of the system being studied.  Second, a packet capture shows both the source and destination of the connections being captured.  Even if all traffic were encrypted, the IP addresses involved would be clear.  I have no idea what to make of this claim, as it's simply odd and inconsistent with my (admittedly limited) experience with cyber forensics.
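
For what it's worth, even the most basic capture tooling makes the endpoints visible.  A sketch, assuming a capture file taken from that ethernet port (the filename is hypothetical):

# Print each packet's source and destination; -nn skips name resolution.
# Endpoints are visible even when the payload itself is encrypted.
tcpdump -nn -r capture.pcap | head
# e.g., "IP 192.168.1.10.49152 > 203.0.113.5.443: Flags [S], ..."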

Section J.4 tries to suggest something nefarious about write-in ballots being sent to adjudication.  Given that Dominion's software lacks any AI-based handwriting recognition, there is really no way to process write-in ballots other than through human intervention.  This is one of the primary purposes of the adjudication functionality.  The attempt to paint it as suspicious suggests a bias in the report's conclusions.

Inconsistent Statements

Throughout the report, ASOG consistently emphasizes the risk that adjudication poses to election security.  They make frequent claims (in B.2, B.8, B.10, B.12) that there were a large number of ballots needing adjudication, at one point even stating, "A staggering number of votes required adjudication."  However, they then go on to claim (in B.15 and B.21) that the records of adjudicated ballots were deleted from the server, which "prevents any form of audit accountability".

Because the intentional high error rate generates large numbers of ballots to be adjudicated by election personnel, we must deduce that bulk adjudication occurred. However, because files and adjudication logs are missing, we have not yet determined where the bulk adjudication occurred or who was responsible for it. Our research continues.

How they know that there were "a staggering number" of adjudicated ballots without the needed records is a mystery left to the reader. 

Outright False Statements

OK, to top this all off, we get this gem of a paragraph.  It's full of claims that have circulated for a while on social media.  They've been well debunked, but we'll touch on them anyway.
Dominion voting system is a Canadian owned company with global subsidiaries.
It is owned by Staple Street Capital which is in turn owned by UBS Securities
LLC, of which 3 out of their 7 board members are Chinese nationals. The
Dominion software is licensed from Smartmatic which is a Venezuelan owned
and controlled company. Dominion Server locations have been determined to be
in Serbia, Canada, the US, Spain and Germany.

  1. Dominion Voting Systems is NOT Canadian owned.  The company was founded in Canada, but in 2018 it was acquired by Staple Street Capital, which is a US-based company.  Dominion itself has headquarters in both Toronto and Denver.
  2. Staple Street Capital's relationship with UBS Securities LLC is not totally clear to me.  Staple Street is an investment firm, and UBS is one of their clients.  From what I can tell, UBS has ~$400M invested with Staple Street Capital.  What's not clear is whether that is a $400M ownership interest or a $400M investment.  It appears, after more reading tonight, that it may have been a stock purchase, suggesting that UBS now owns some share of SSC.  However, what % ownership is represented by $400M is unclear.
  3. Yes, 3 of UBS's board members are Chinese.
  4. Dominion does not license software from Smartmatic.  I've been able to find lots of people claiming this, but zero evidence.  Meanwhile, we have Dominion and Smartmatic both publicly stating that no such relationship exists.  While it's possible they are lying, it would be pretty disastrous to both companies' interests if they were caught in such a lie.  Without meaningful proof to the contrary, I'm inclined to believe them.
  5. Where Dominion has servers is not relevant to anything, as those servers have nothing to do with the tabulation of votes.  From everything I can tell, vote tabulation is strictly air-gapped to ensure security.  Votes can be transferred via flash drive, CF card, or point-to-point modem connections.

In Conclusion

ASOG has a major credibility problem, but they are nonetheless making some powerful claims.  Some of these claims are solid enough, and backed with enough evidence, that I believe they demand an investigation.  However, none of this is indicative of widespread fraud.  A hand re-count in Georgia of ballots cast using the same Dominion system found no meaningful discrepancy, which further reduces the chances that there was some widespread, coordinated effort to impact election results.  I expect this to be somewhere between a local scandal and a non-story.  However, our elections are a HUGE deal.  In addition to a hand re-count in the affected precinct, I would wholeheartedly support an FBI investigation into both this local race as well as a small selection of random voting machines from throughout the state.  This is one case where you really can't be too safe.

Finally, President Trump is still in office for another month.  He controls a number of agencies that have the talent and resources to investigate these claims.  He clearly believes them.  He clearly has every reason to want to prove them true.  So, stay tuned.  If you hear nothing meaningful about the FBI, CIA, NSA, CISA, etc. finding that ASOG is right, then you can be quite certain that there was nothing to be found.  There can be no doubt Trump will have multiple armies (maybe even THE Army) investigating this ASAP.


Thursday, December 28, 2017

re:Invent 2017 Debrief

Well, it's been a long month, but I think it's time to finally revisit my re:Invent predictions.  I did better than I had expected.  In each section below, my original prediction comes first, followed by my assessment.

General Announcements and Themes

Serverless

They'll also announce several new services that are purely serverless in nature.  These services will be around: 1. making serverless more secure, 2. making serverless faster and easier to deploy and test (including a HUGE K8s announcement), 3. serverless monitoring.
I think I nailed this one pretty well.  From the widely expected ECS for K8s, to the new Aurora Serverless database, there was a lot to take in on the serverless front.  However, I think there is still a large hole around security and monitoring for serverless.  It's not that you can't use tools like X-Ray and AWS WAF to solve those problems, but I feel like more could be done here.

Security

Security is only going to get hotter over the next few years, and AWS doesn't want you to think about security.  So I expect to see more services (or at least more features for WAF and Inspector) to help make security super easy.  In the spirit of serverless computing, all of these will be fully managed services and many of them will not even require action on the part of users.
I was spot on with this one, and so was AWS.  GuardDuty is going to be a super helpful product, and it is as easy as it comes.  A few button clicks and you have this massive new security system in place.  I'll be interested to see how well they build the interface.  Those with minimal security understanding will need to be able to grasp what GuardDuty does and does not do for them.  It also needs to be crazy easy to understand what actions to take when an event is detected.

Data analytics

I considered machine learning, but I think that's next year.  I do expect to see some very targeted announcements in the Polly/Rekognition/Lex family announced last year, but I think those will mostly be very specific ML tools.  Next year I expect to see some broad ML announcements designed to make custom deep learning models super easy, but I just don't think that's been solved yet.  So this year will focus back on analytics.  Specifically, using serverless computing to process and analyze data super easily.  Among other things, this will involve some new connector services that will help pull together various AWS services more easily.  Expect at least a few of these to be Lambda branded.
Well, I missed the target badly on this one, but I'm glad I did.  AWS overshot my expectations, plain and simple.  SageMaker is not only the coolest named service of all time, but looks to be a HUGE step up in making ML accessible to the masses.  

Specific Announcements

Ok, so I'll probably only get 1-2 of these, but here's what I'm expecting/hoping to see:

New instance types

Specifically, it's time for an M5 instance type and a machine-learning-specific instance type.  Wildcard: maybe a new instance family with fast interconnects specifically targeting supercomputer users, but I don't think so yet.
Yes M5, no ML specific.  Not fast interconnect yet, but we did get bare metal and a new P class instance.

New Security Services

A new IDS NAT gateway option
A virus scanning something...
So sad there is still no AWS IDS :(

Serverless

A managed Kubernetes.  Maybe as an ECS-branded thing, but probably its own thing.
A way to run Lambda on your data more directly... Not exactly sure here, but maybe a Lambda tie-in with one of the big data offerings?  Redshift, maybe.
Meh, it was a stretch.

Blockchain

It's hot, and I can't imagine AWS will miss this train.  Expect a blockchain service that is fully managed.
Maybe my biggest surprise.  AWS is missing this train for now.

Bare Metal

After doing some reading last night, I came across an article by James Hamilton about Oracle's pipe dream of competing in the cloud with 1/10th the average investment.  In the comments, he talked a bit about bare metal in the cloud.  Given his comments, I think we'll see this soon.  Maybe not this week, but certainly by next year.  This is basically a service that lets you run your own hypervisor on AWS hardware.  It's the opposite of what AWS wants to do, but it would greatly speed cloud migrations for many companies that currently run on VMware and similar hypervisor systems.
Yup

Multi-Region Aurora

We NEED a multi-region RDBMS.  I don't expect to see this outside Aurora, but I do think we'll see a multi-region option coming to Aurora very soon.  What's not clear is how you overcome the speed of light issues.  How do you ensure 100% data retention while still having millisecond commit time?  My left field prediction is that we will see other database systems (redshift, dynamo, kinesis) also go multi-region.
Well, apparently the speed of light is still a problem ;).  However, multi-region, multi-master Dynamo is pretty amazing stuff.  It became more clear to me at re:Invent that this just can't be done with guaranteed consistency models.  But multi-region read replicas for Aurora are a pretty good step.

Other random stuff

  • Yup, 5 venues was a mess, but I don't think it's going to get better
  • I only rode the buses one day, but it was ugly, and I hear it didn't get a lot better
  • The party was actually not as bad as I expected.  They expanded to 3 tents, but it was still pretty crowded.  They did a good job breaking things out between the tents too.
  • Werner failed me.  WORST KEYNOTE EVER!
  • No MLB this year.  The NFL came instead! :)   Glad to see them finally catching up with this.
  • Well, we have a new certification, but it makes no sense to me.  There was really no need for a "freshman" certification.  I think it devalues the entire cert system.  As a side note, AWS is getting TOO BIG for general certs, though.  It may be time to specialize them more into areas like Compute, Data, Network, Serverless/Container, etc.
  • No check-in gift, but lots more random giveaways.  Overall, I got a pair of buttons, a DeepLens, a 10" Fire tablet, an Echo Plus with a lightbulb, and some sort of internet button thing that I can't find any documentation on.
  • The hoodies had NO blue on them, and were certainly not better quality :(
  • There were some printer giveaways
  • Netflix had some cool new stickers
  • Tuesday night was actually a lot better than I expected.  Still not as good as James Hamilton, but pretty great nonetheless.

Saturday, November 25, 2017

My re:Invent 2017 Predictions

re:Invent is always full of fun, lessons, tons of swag, and new AWS services.  This will be my 4th re:Invent, and I think I'm finally getting into the swing of things.  I'm already too late to bother with a re:Invent tips blog post, and let's face it, I have no followers to care, so I'm skipping directly to my predictions.  I'll go with three categories: general themes of the keynotes and announcements, specific service announcements, and random other stuff.  Maybe, just maybe, I'll even follow up with a post to reveal how I did.

General Announcements and Themes

So, let's be straight here, it's really hard to accurately predict the exact new services that will be offered.  I'm still going to try, but lest I look completely incompetent, I'm going to start with some more general predictions.

Serverless

Serverless computing will be a HUGE theme again this year.  AWS has been working to figure out exactly how Lambda can play over the last 2 years, and I think they've got the architecture pretty much figured out.  Expect this year to be the year that they declare serverless ready for prime time.  As part of that, not only will they showcase how at least one major company is doing something amazing with serverless, but they'll also announce several new services that are purely serverless in nature.  These services will be around: 1. making serverless more secure, 2. making serverless faster and easier to deploy and test (including a HUGE K8s announcement), 3. serverless monitoring.

Security

Security is only going to get hotter over the next few years, and AWS doesn't want you to think about security.  So I expect to see more services (or at least more features for WAF and Inspector) to help make security super easy.  In the spirit of serverless computing, all of these will be fully managed services and many of them will not even require action on the part of users.

Data analytics

I considered machine learning, but I think that's next year.  I do expect to see some very targeted announcements in the Polly/Rekognition/Lex family announced last year, but I think those will mostly be very specific ML tools.  Next year I expect to see some broad ML announcements designed to make custom deep learning models super easy, but I just don't think that's been solved yet.  So this year will focus back on analytics.  Specifically, using serverless computing to process and analyze data super easily.  Among other things, this will involve some new connector services that will help pull together various AWS services more easily.  Expect at least a few of these to be Lambda branded.

Specific Announcements

Ok, so I'll probably only get 1-2 of these, but here's what I'm expecting/hoping to see:

New instance types

Specifically, it's time for an M5 instance type and a machine-learning-specific instance type.  Wildcard: maybe a new instance family with fast interconnects specifically targeting supercomputer users, but I don't think so yet.

New Security Services

A new IDS NAT gateway option
A virus scanning something...

Serverless

A managed Kubernetes.  Maybe as an ECS-branded thing, but probably its own thing.
A way to run Lambda on your data more directly... Not exactly sure here, but maybe a Lambda tie-in with one of the big data offerings?  Redshift, maybe.

Blockchain

It's hot, and I can't imagine AWS will miss this train.  Expect a blockchain service that is fully managed.

Bare Metal

After doing some reading last night, I came across an article by James Hamilton about Oracle's pipe dream of competing in the cloud with 1/10th the average investment.  In the comments, he talked a bit about bare metal in the cloud.  Given his comments, I think we'll see this soon.  Maybe not this week, but certainly by next year.  This is basically a service that lets you run your own hypervisor on AWS hardware.  It's the opposite of what AWS wants to do, but it would greatly speed cloud migrations for many companies that currently run on VMware and similar hypervisor systems.

Multi-Region Aurora

We NEED a multi-region RDBMS.  I don't expect to see this outside Aurora, but I do think we'll see a multi-region option coming to Aurora very soon.  What's not clear is how you overcome the speed of light issues.  How do you ensure 100% data retention while still having millisecond commit time?  My left field prediction is that we will see other database systems (redshift, dynamo, kinesis) also go multi-region.

Other random stuff


  • The new format with 5 venues is going to stink
    • I've got Monday at the Aria, Tue-Thur I'm in the park at Jam sessions (these are great).  Nonetheless, I still expect it to be a giant nightmare.
  • The bus system will be a mess (I'm cheating a bit on this one, my first bus ride this AM stunk)
  • The party is going to be too crowded to enjoy.  The first year they moved outside was pretty good, but last year the tents were just not big enough.  I loved that they put the games in one tent and the music in the other, but with the larger crowd, they'll need at least 3 tents this year to even have a chance.
  • Werner will outdo himself again.  The snowmobile was just amazing last year.  Don't get me wrong, I didn't hear a soul say "I'm so glad they have that now", and I imagine they've got dozens of customers total for that thing, but the wow factor was great.  It was the buzz of the rest of the day.  I expect them to find a way to do something more amazing this year.
  • MLB will be back.  Every year they seem to include MLB in one of their keynotes.  I'm not a baseball fan, but these guys are doing amazing stuff with technology.
  • They'll introduce another specialty exam.  Last year they tried analytics, security, and networking, and two of the three were a success.  This year I think they'll introduce a serverless exam.
  • Last year we all got an Echo Dot at check-in.  This year it will be a pair of Echo Buttons.
  • The hoodies will be better than last year, and have more blue on them.
  • At least one vendor will be giving away a 3d printer
  • Netflix will have new stickers
  • The Tuesday night speaker will not be as cool as James Hamilton.  Seriously, go watch this.  I don't care who you are, you can't top that.

Wednesday, January 27, 2016

Cloudformation on the Commandline

This post is mostly a reminder for me, as I have this issue often enough to kill me, but rarely enough that I always forget the solution.

When you have a CloudFormation template that has a parameter of a `List` type, passing a value to it on the commandline is NOT intuitive.

"Parameters": {
  "Subnets": {
    "Type": "List<AWS::EC2::Subnet::Id>
  }
}

Here's the trick.

myprompt>aws cloudformation create-stack --stack-name some-stack --template-url https://s3.amazonaws.com/.....   --parameters ParameterKey=Subnets,ParameterValue=\'subnet-2353264,subnet-6513946131\'

NOTICE that we did not put any spaces in the subnet list and that we had to quote it, but not just quote it, we had to escape those quotes too.
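
Alternatively, you can sidestep the shell-quoting dance entirely with a parameters file.  (A sketch; I haven't tested this against this exact template, but file://-style parameters are standard CLI behavior.)

# Put the parameters in a JSON file; commas in the value need no escaping here
cat > params.json <<'EOF'
[
  {"ParameterKey": "Subnets", "ParameterValue": "subnet-2353264,subnet-6513946131"}
]
EOF

aws cloudformation create-stack --stack-name some-stack --template-url https://s3.amazonaws.com/..... --parameters file://params.json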

Thanks to https://github.com/aws/aws-cli/issues/870 for getting me pointed in this direction.

Friday, October 10, 2014

Saving some cash - (Multi domain wildcard certificates)

Passion

I love to save money.  It's best when I save my own money, but saving my company money is great too.  It's part of why I love my job.  As a Developer Happiness Engineer, my job is to make developers' lives easier, so they can be more productive, and so that we can hire fewer of them to do the same great work.  Really, my job boils down to increasing productivity, reducing server count, and keeping everything repeatable and auditable.

Certificates

One of the necessary evil costs of running a professional website is encrypting traffic.  When you're a healthcare company, this becomes even more critical.  After all, this is pretty sensitive data.  But certificates are not free.  What's worse, I discovered yesterday that even the expensive ones only cover one level of subdomains.

You see, I was reviewing our URL structure and realized it was ugly.  We had things like myapp-staging.mydomain.com in one place, and things like staging-someapp.mydomain.com in others.  To make it worse, our Route 53 hosted zone had become way too big, and it was getting hard to manage.  So I decided it was time to clean up the hierarchy and use subdomains to provide some structure to our URLs.

The Problem

Well, we have a wildcard domain cert through a major certificate provider.  It applies to *.mydomain.com.  When I went to create www.develop.mydomain.com, I discovered that our cert was not valid for it.  Yep, it only covers one level of subdomain.  So my grand plan was going to cost us a couple thousand dollars by the time I covered all 7+ subdomains we needed.  Lucky for me, only one of our domains is actually used by external customers.  Everything else is internal to our system, or accessed by our own employees.

Self-signed Certificates to the Rescue?

So that means we can use self-signed certs, right?  Well, yes, but how many of them?  And how to distribute them?  Yuk.  If only I could get ONE certificate that covered ALL of my domains and subdomains across ALL internal infrastructure.

A Better Way

Well, it turns out you can.  You just need to create a certificate with Subject Alternative Names (SANs).  I found a blog post on self-signed SAN certs that did a pretty good job.  I just had to clean it up a little and elaborate by adding wildcard SANs.  And without further ado, here is the process.

Creating a multi-domain wildcard certificate

Cautions

First, a quick caution.  When you include ANY SANs in a certificate, you must include ALL of your names as SANs.  That is, the CN in your cert is ignored if you have even one SAN listed.  So be sure to include your base domain in your SAN list.

The Extensions File

You must have a file that declares your certificate extensions.  You can do this in your openssl.cnf file, or you can do it (as I do) in a standalone file.

extensions.cnf

[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names

[ alt_names ]
DNS.1 = mydomain.com
DNS.2 = *.mydomain.com
DNS.3 = myotherdomain.com
DNS.4 = *.myotherdomain.com
DNS.5 = *.mysubdomain.mydomain.com


Generate the certificate


# First create a key file (you'll need to enter your password 2X)
openssl genrsa -des3 -out mykey.key 2048

# Next convert it to a passwordless key (remember that password?)
openssl rsa -in mykey.key -out mykey.rsa

# Now you need to create a signing request
# This requires you to enter a bunch of info
# I include:
# Country Name
# State
# City
# Organization Name
# Email Address
# But leave blank the OU, CN, and Password
openssl req -new -key mykey.rsa -out mykey.csr

# Finally, we generate the actual certificate
openssl x509 -req -extensions v3_req -days 365 -in mykey.csr -signkey mykey.rsa -out mykey.crt -extfile extensions.cnf

And that's it, you now have a certificate you can use to secure any domain you had listed in that extensions.cnf file.
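
Before you distribute it anywhere, it's worth verifying that all of your SANs actually made it into the certificate:

# Dump the cert and confirm the Subject Alternative Name section lists every domain
openssl x509 -in mykey.crt -noout -text | grep -A 1 'Subject Alternative Name'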

 Good luck!

Monday, May 12, 2014

Musings on Chef

Background

This post should have been written months ago, but I've been busy with a new job and laziness, and I've been distracted by Cloud Foundry.  Frankly, I haven't really written much Chef code in 5 months now.  So, now that I'm feeling some Chef pain again, I'm back to looking for a way to make life better.  My frustrations largely center on a few areas of integration between cookbooks, as well as on the management of version-less components of Chef.
  1. Roles are not versioned, which means I can't change the behavior of a role in one environment without impacting others.
  2. Environments, nodes, and databags are not versioned.  
  3. Cross-cutting concerns are everywhere.  Ports, paths, logs, services, etc. I need a way for my cookbooks to communicate these cross-cutting concerns.
  4. Dependency management.  I like to keep my cookbook dependencies clean, but when you have dozens of cookbooks, that gets pretty challenging.  

The Details

Roles

Roles are the forgotten stepchild of Chef.  They were never really needed, and were never done right.  What's a Role?  Well, it's a collection of recipes, other roles, and attributes.  What's a cookbook?  It's a collection of recipes and attributes (plus some other stuff).  In other words, a Role is just a castrated cookbook.  They'd be useful if they hadn't been stripped down to the point of removing versioning.  With versions missing, there is no way to migrate a change in your roles from develop to staging to production.  So people resort to all sorts of odd behavior to allow this.  At my current employer, we have roles with name_<environment> type names.  It's a hack, it's ugly, and it makes it really easy to mess something up.  Besides, a major reason for Chef is to ensure consistency between environments.  My answer: don't use roles.  Just use very simple cookbooks that wrap other cookbooks and provide the function of roles, while also providing versions (see the sketch below).
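
Here's a minimal sketch of what I mean (cookbook and dependency names are made up).  The wrapper is just an ordinary cookbook, so it gets a real version number that you can pin per environment:

# Create a hypothetical "role cookbook" that replaces a webserver role
mkdir -p cookbooks/webserver_role/recipes
cat > cookbooks/webserver_role/metadata.rb <<'EOF'
name    'webserver_role'
version '1.2.0'            # bump to roll changes through dev -> staging -> prod
depends 'base',  '~> 2.0'
depends 'nginx', '~> 1.4'
EOF
cat > cookbooks/webserver_role/recipes/default.rb <<'EOF'
include_recipe 'base::default'
include_recipe 'nginx::default'
EOF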

NOTE: Yes, I know that this prevents you from doing any searches on roles, and likely hides most of the runlist from searches for recipes too.  However, you really shouldn't be searching roles and recipes anyway.  They are the HOW of a node.  You should be searching the WHAT of a node.  What it does, or provides.  See the section on cross-cutting concerns to see why I don't need to search for roles and recipes.

Environments, Nodes, and Databags

Mostly this is a workflow issue, as I really only care about having them in version control, not so much about having them version numbered.  Unlike Roles, you don't really move environments and nodes from dev to staging, etc.  Databags you could argue either way, and I honestly haven't used them enough to really comment.  I'm still working on a good solution to this, but for now I keep these items locally on my machine in a git repo.  Any changes require a commit before they are uploaded to the server.  Honestly, I wish there were a way to have the Chef server use a git repo for these items.
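
For now, the workflow looks roughly like this (knife's from-file subcommands are standard; the repo layout and file names are just my convention):

# Commit the change first, then push it to the Chef server
git add environments/staging.json
git commit -m "staging: bump pinned app cookbook to 1.2.0"
knife environment from file environments/staging.json
# The same pattern works for nodes and data bags:
#   knife node from file nodes/web01.json
#   knife data bag from file users alice.json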

Cross-Cutting Concerns

I hate dependencies, and try to limit the number of cookbooks that each of my cookbooks requires, largely because I don't want to tell you how to do things.  Thus, my cookbook should tell you about the logs it creates, the services it provides, the ports it uses, etc.  However, my cookbook should not open ports, process logs, etc.  The challenge is a consistent language between cookbooks to ensure that my firewall cookbook can find all of my ports.  The folks at Infochimps created the silverware cookbook, which goes a long way toward bringing an aspect-oriented approach to cookbook development, but I found their cookbook to be pretty intense, and I had trouble trying to learn it.  So I'm starting development on my own Spicerack cookbook.  The intent is to provide a set of small libraries that will allow for the easy sharing of information between cookbooks.  Presently, I intend to support ports, services, endpoints, logfiles, and possibly directories.  The initial test/use case will be wiring together some web services with a firewall cookbook of some sort, logstash, and kibana.

Dependency Management

As I already mentioned, I hate dependencies, and do my best to avoid them.  But I also believe in DRY compartmentalized code, so I have plenty of dependencies.  Spork from Etsy goes a long way to making version management of cookbooks easier.  It also makes managing the versions used by each environment easier.  What it doesn't do yet, nor does anything else I've found, is handle the dependencies between cookbooks.  When I update a library cookbook, I'd love a tool that would give me a list of all cookbooks in my library that depend on that updated cookbook.  I could then choose whether to update the dependency (and thus the version # for each of those cookbooks), or leave it alone.  
That said, I'm becoming increasingly convinced that this is a smell in my process.  I've become quite fond of semantic versioning and ~> constraints.  I'm hoping that the combination of these two will go a long way toward making dependency management easier.  Nonetheless, knowing which cookbooks need to use my update would be great.

So there you have it.  Coming soon, How Docker and Conf.d are Changing my Entire DevOps Approach.

Wednesday, February 26, 2014

Growing Pains (and Joys)

Well, I've been on the job for 2 days now, and I'm feeling the pain of new technologies.  Believe it or not, I just got my first smartphone on Monday.  I'm cheap, and have never been able to bring myself to pay more than $20 a month for cell coverage, so smartphones were out of the question.  I can say that learning to use an Android phone has been a challenge, but a wonderful experience.  I feel more connected, and love the ability to access things when I'm away from home.  I'm sure I'll feel the leash that comes with it soon enough, but overall I'm thrilled to finally be carrying a modern phone.
My new laptop, however, is another story.  I got my first Mac yesterday, a nice new MacBook Pro.  I will say that the size and battery life are a wonderful step up from the Dell I had when I worked at Lockheed.  Learning a new OS, however, is killing me.  I feel kinda clueless when I have to use Google or ask a friend just to figure out simple things like: where is my terminal?  How do I install apps?  I don't care what they say, the idea that Apple products "just work" is laughable to me.  I'm still trying to find an email client that I'm happy with.  That said, it's been one day, so I'm sure that OS X will grow on me.  The sad thing is that I'm probably a month away from hating both Windows and OS X.  I'll be one of those dual users who is always wishing for the features of the OS I'm not using at the moment.
The final challenge, which I will start today, is the new software stack.  CareKinesis is using a lot of tools that I've only read about.  I'm thrilled for the challenge of learning new software, and for the joy of finding great ways to use it to make myself more effective, but I'm also very aware of how mentally straining it will be to learn 2 or 3 new tools per day for a few weeks.  Meanwhile, my two-man team will be making some major decisions in the next few weeks about some additional tech we want to add to the stack.  There will be a lot of brain drain for a while, but it's going to be a fun ride.