Got a late Amazon Prime Package in 2017? Ouch, that was my bad code!
My code broke and affected thousands of customers, yet I still got promoted.
It was Amazon’s Prime Day in 2017.
Many customers who placed orders with a particular kind of discount missed the 'Prime 1-day delivery promise' and received their orders up to a day late.
My code was the culprit.
This was my promotion project, and I messed it up. Interestingly, I still got promoted a quarter later.
Today, I am sharing a personal experience that will help you understand that sometimes while striving for growth and promotion in your career, you might face setbacks. However, if you handle difficult situations well and have strong leadership that trusts you, you can emerge stronger.
👋 Hey there, I am Gourav. I write about Engineering, Productivity, Thought Leadership, and the Mysteries of the mind!
🧨 What happened?
Amazon India was an emerging market.🚀
When I joined Amazon India in 2016, the market was just emerging. I became a member of a new Payments team responsible for launching features to make purchases more affordable for customers. One of our major projects was "Instant Discount" which provided discounts for using partner credit cards. This feature was expected to drive significant revenue.
Our team was in startup mode, moving fast, breaking, and fixing things. The architecture was complex and evolving.
Prime Day Rush.🤹‍♂️
I designed and implemented a new payment processor service to handle discounts on orders. I wrote unit and integration tests for the service and also tested it manually.
Confident in our stable state, we declared "Go time" for the 2017 Amazon Prime sale.
The entire company prepared for the heavy surge in traffic, expecting around 100,000 transactions per minute. We scaled up our services, increased autoscaling capacity, and added more memory and CPU to the servers.
Then came the D-Day.
Finally, the sale started, and we had an "Ouch" moment. 💥
Many orders with instant discounts started failing in the service. This caused payment workflows to get stuck and delayed order shipping.
The issue? I wrote flawed code in the payment processor service, using Double instead of BigDecimal for monetary amounts in Java. Binary floating point cannot represent most decimal fractions exactly, so non-whole prices picked up precision errors. This led to incorrect discounts and mismatched totals.
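To make this class of bug concrete, here is a minimal sketch. It is not the actual service code: the class name, method names, and sample prices are all hypothetical and chosen only for illustration. It shows how a total computed with `double` can fail an exact comparison, while the same arithmetic with `BigDecimal` (built from strings, with an explicit scale and rounding mode) stays exact.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical illustration only; names and values are assumptions, not the real service.
class DiscountPrecision {

    // Sums prices with binary floating point; decimal fractions may not be exact.
    static double totalWithDouble(double[] prices) {
        double total = 0.0;
        for (double price : prices) {
            total += price;
        }
        return total;
    }

    // Sums prices with BigDecimal built from strings, so each value is exact.
    static BigDecimal totalWithBigDecimal(String[] prices) {
        BigDecimal total = BigDecimal.ZERO;
        for (String price : prices) {
            total = total.add(new BigDecimal(price));
        }
        return total;
    }

    // Applies a percentage discount, rounding the discount to 2 decimal places.
    static BigDecimal applyDiscount(BigDecimal amount, BigDecimal percent) {
        BigDecimal discount = amount.multiply(percent)
                .divide(new BigDecimal("100"), 2, RoundingMode.HALF_UP);
        return amount.subtract(discount);
    }

    public static void main(String[] args) {
        double doubleTotal = totalWithDouble(new double[] {0.10, 0.20});
        // An exact comparison against the expected total fails with double.
        System.out.println("double total matches 0.30: " + (doubleTotal == 0.30));

        BigDecimal exactTotal = totalWithBigDecimal(new String[] {"0.10", "0.20"});
        // compareTo ignores scale differences, unlike equals.
        System.out.println("BigDecimal total matches 0.30: "
                + (exactTotal.compareTo(new BigDecimal("0.30")) == 0));

        System.out.println("Discounted amount: "
                + applyDiscount(new BigDecimal("199.99"), new BigDecimal("10")));
    }
}
```

Two details matter in a sketch like this: constructing `BigDecimal` from a `String` rather than from a `double` (which would carry the floating-point error along), and always stating the scale and rounding mode when dividing, so that every service in a payment workflow arrives at the same total.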
😓 Chaos, Mitigation, and Guilt
We started receiving sev-2 tickets, customer complaints, and business inquiries about the issue. I checked the logs and explained the problem.
Order delays affected customers and the delivery pipeline. Our leadership got involved. My manager, Arun, was a great leader, and my senior engineer, Puneet, was the most respected engineer on the team.
Puneet and I took the lead in resolving the situation. We quickly deployed a hot patch to production during the sale event. But this wasn't enough to fix the workflows already stuck in a loop, so those needed manual intervention.
We involved all the engineers, divided the huge list of workflows, and prepared a runbook to resolve the issues. It took us seven hours to fix all the workflows, leaving everyone drained.
I overheard some engineers criticizing me for the mistake, and I felt guilty. 💔
🏋️♀️ How I Still Got Promoted: Learnings and Key Takeaways
I was up for a promotion that quarter, which got delayed for obvious reasons. Yet, my promotion came in the following quarter. Here’s why I still got promoted quickly:
⚡ Trust
Throughout the project, I progressed in line with the team's expectations. My manager appreciated my speed of execution and proactive communication. We knew there could be sharp edges, but we moved forward pragmatically and skipped exhaustive testing to meet the timelines.
My senior engineers acknowledged that the service handled normal cases correctly, and only 10% of orders were affected. Due to previous trust, my leadership supported me instead of blaming me. They understood that such reversible mistakes (two-way door decisions) can happen when moving fast.
⭐️ Key takeaway: Build trust by being a go-getter and meeting leadership expectations, and they'll support you in tough times.
🔥 Adaptation and Ownership.
After the incident, I adapted and helped mitigate the issue. I also took a step back to understand the architecture better and added guardrails to make the service more resilient.
Demonstrating ownership in learning from mistakes and improving the service contributed to my promotion.
⭐️ Key takeaway: Always learn from your mistakes, take the lead, set examples, and come out stronger.
🧳 Business Backing
In the next quarter, I not only improved the service but also collaborated with product managers to incorporate more features.
This restored my reputation and earned positive feedback from business leaders.
⭐️ Key takeaway: Build trust with business leaders, keep them informed, and help them achieve their goals.
🏆 Strong Leadership
Understanding and supportive leadership is a blessing. Arun and Puneet helped me focus on improvements rather than dwelling on the mistake. Their guidance was invaluable, and they've remained close friends for life.
⭐️ Key takeaway: Strong leadership is a blessing. If you have supportive leaders, maintain lifelong relationships and strive to become a similar leader.
Shoutouts 🔊
An amazing newsletter-writing journey: Reflecting on my 6 months of writing this newsletter
Worth supporting: Mastering Feature Development in Software Engineering
Understand whether you want to stay in or quit your current job: When is the Right Time to Quit?
I really loved the heading and content of this article: The real job security these days is knowing you can get another job
Being a leader means having a backbone: Don’t be a Spineless Leader: 3 Tips to Lead Better
Gourav Khanijoe
Wow, I was an intern at Amazon in Seattle at the time of Prime Day :D
Thanks for sharing your experience. I can imagine the situation at that time.
I have also had similar experiences on a smaller scale. One other learning here is to always boldly present the time vs. quality tradeoffs to your manager upfront. Also, nothing comes with a 100% guarantee, so expectations need to be aligned with managers and leaders, including accepting the downsides of decisions.
Thanks for the shoutout 🙌🏻🙌🏻