You could put up a high-end server with multiple CPUs, a lot of memory, and fast disks.
I ran volume tests: a one-CPU machine can handle about 20k volleys a minute, or roughly 3k active users at once, tops.
An 8-CPU machine with lots of memory and fast SSD storage (provisioned IOPS SSD, io1) should theoretically handle 150k volleys per minute, or about 24k concurrent users. That is a lot of users and volume. That is with my CS code.
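As a sanity check on those numbers, here is the linear back-of-the-envelope math (a sketch using the single-CPU figures above; real multi-CPU boxes scale a bit worse than linear, which is why I quote 150k rather than the linear 160k):

```python
# Linear capacity estimate from the single-CPU test numbers above.
VOLLEYS_PER_CPU_PER_MIN = 20_000
USERS_PER_CPU = 3_000

def estimate_capacity(cpus):
    # Assumes throughput scales linearly with CPU count (an upper bound).
    return cpus * VOLLEYS_PER_CPU_PER_MIN, cpus * USERS_PER_CPU

print(estimate_capacity(8))  # → (160000, 24000)
```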
The problem with this architecture is scaling: to handle your peak periods, you end up with one massive server.
Because everything is written to local disk, you can only run a single instance, and that instance has to be sized for your peaks. Scaling a live single server up and down is difficult, unless you know some tricks.
With a database layer, you can put a load balancer in front as the front door to your app servers and have it distribute load across the available app servers.
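The simplest distribution scheme a load balancer uses is round robin: each request goes to the next server in the pool. A minimal sketch (server names are placeholders):

```python
# Toy round-robin routing: rotate through the app server pool.
SERVERS = ["app-1", "app-2", "app-3"]

def route(request_id, pool=SERVERS):
    # Each request lands on the next server in the rotation.
    return pool[request_id % len(pool)]

print([route(i) for i in range(5)])  # → ['app-1', 'app-2', 'app-3', 'app-1', 'app-2']
```

Real load balancers (AWS ELB/ALB, nginx, HAProxy) also do health checks and only route to servers that are up, but the core idea is this rotation.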
With AWS, you can auto scale your app servers up and down based on load.
So, if your load goes up, you add a bunch of smaller app servers.
When it goes down, auto kill some app servers.
This lowers your cost.
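The scaling decision itself is simple math, similar to what an AWS target-tracking policy computes for you. A sketch, where the per-server capacity, target utilization, and min/max bounds are my assumptions for illustration:

```python
import math

# Toy autoscaling policy: derive the desired app server count from load.
CAPACITY_PER_SERVER = 20_000   # volleys per minute one app server can absorb
MIN_SERVERS, MAX_SERVERS = 1, 20

def desired_servers(volleys_per_min, target_utilization=0.7):
    # Size the fleet so each server runs at ~70% of capacity, with headroom.
    need = math.ceil(volleys_per_min / (CAPACITY_PER_SERVER * target_utilization))
    return max(MIN_SERVERS, min(MAX_SERVERS, need))

print(desired_servers(5_000))    # quiet period → 1 server
print(desired_servers(100_000))  # peak → 8 servers
```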
And all app servers point to the same database server.
The database tier can auto scale as well, but not as quickly.
With this architecture, you can scale up and down without impacting the end user's experience.
There is a lot more to this, but that is the basic premise.
If someone is starting out, I recommend putting up a load balancer that points to a single app server that writes data locally.
No database. If you have very high demand, migrate to the database option.
People would already be pointing at the load balancer.
You would need to put up new servers that point to the database.
And as a one time task, migrate the local users to your database.
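That one-time migration can be as simple as walking the local files and inserting rows. A sketch using sqlite3 as a stand-in for the real database; the one-JSON-file-per-user layout and the schema are hypothetical:

```python
import json
import os
import sqlite3
import tempfile
from pathlib import Path

def migrate_local_users(data_dir, db_path):
    """One-time task: load each locally written user file into the database.
    Assumes one JSON file per user, e.g. data_dir/alice.json."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS users (name TEXT PRIMARY KEY, score INTEGER)")
    for f in Path(data_dir).glob("*.json"):
        user = json.loads(f.read_text())
        # INSERT OR REPLACE makes the migration safe to re-run.
        conn.execute(
            "INSERT OR REPLACE INTO users (name, score) VALUES (?, ?)",
            (user["name"], user["score"]),
        )
    conn.commit()
    conn.close()

# Demo with a throwaway directory standing in for the app server's local data.
d = tempfile.mkdtemp()
Path(d, "alice.json").write_text(json.dumps({"name": "alice", "score": 7}))
migrate_local_users(d, os.path.join(d, "app.db"))
```

Once the rows are in, flip the new app servers to read from the database and retire the local files.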
It is faster to run everything locally, so that is what I recommend as a starter.
Hope this helps.