We run a heavily customized news portal (lahaine.org) with about 70 000 posts. We started with b2, moved to b2evo 0.9 and now run b2evo 2.4.7. We are now thinking of upgrading to b2evo 5, but we have one question in particular about performance.
We get about 50 000 page views per day, and during special events that number can grow considerably. Three or four years ago we began to notice serious performance issues, so we investigated and discovered that the problem was the table evo_postcats. With 70 000 posts, postcats certainly holds more than 100 000 records, and we think the process of reading postcats every time items__item is read was not really optimized, at least in 2.4 and at least for a large number of posts. So we came up with a temporary solution: on special occasions, we deleted most records from postcats, leaving just the last 5 000 or so, and performance improved dramatically: from about 45 seconds per page to a couple of seconds.
We don't know exactly why that happens. We know that in a normal blog with, say, fewer than 10 000 posts it's nice to assign posts to different categories or even blogs, and of course that is mandatory in a standard CMS. But in a special-purpose system like ours we can decide how many subcategories we will allow, so we decided to eliminate the postcats table completely. We added 3 fields to items__item: blog, subcat1 and subcat2, and now performance is acceptable in every circumstance we have run into.
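For readers following along, the change described above can be sketched roughly as follows. This is only an illustration in Python with an in-memory SQLite database, not the actual b2evo schema or code: the table names postcats and items and the sample data are simplified stand-ins, and only the column names blog, subcat1 and subcat2 come from the post.

```python
import sqlite3

# In-memory database; table layouts are simplified, hypothetical stand-ins.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stock-style layout: one row per (post, category) link.
    CREATE TABLE postcats (post_id INTEGER, cat_id INTEGER);
    -- Modified layout: a fixed number of category columns on the post row.
    CREATE TABLE items (
        post_id INTEGER PRIMARY KEY,
        title   TEXT,
        blog    INTEGER,
        subcat1 INTEGER,
        subcat2 INTEGER
    );
""")

# Made-up sample data: post 1 belongs to blog 3 and subcategories 17 and 42.
conn.execute("INSERT INTO items VALUES (1, 'Example post', 3, 17, 42)")
conn.executemany("INSERT INTO postcats VALUES (?, ?)", [(1, 17), (1, 42)])


def cats_via_link_table(post_id):
    """Look up a post's categories through the separate link table."""
    rows = conn.execute(
        "SELECT cat_id FROM postcats WHERE post_id = ? ORDER BY cat_id",
        (post_id,),
    )
    return [r[0] for r in rows]


def cats_via_columns(post_id):
    """Read the categories straight from the post row, with no second table."""
    row = conn.execute(
        "SELECT subcat1, subcat2 FROM items WHERE post_id = ?",
        (post_id,),
    ).fetchone()
    if row is None:
        return []
    return [c for c in row if c is not None]
```

The trade-off is the one mentioned in the post: the column layout caps the number of subcategories per post at a fixed value, in exchange for never touching a large link table when a post is read.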
So now, as we begin to prepare the move to b2evo 5, the question is whether we will need to make the same modifications to our system, or whether you think the core improvements in the new versions of b2evo are enough to bring performance to an acceptable level for, say, 100 000 posts. Or maybe we are doing something wrong and there is another, less radical solution...
I have worked with a 240k posts / 40k categories system, and I can confirm that b2evo loads all categories into memory. It even does it twice in some situations! I added a patch to v5 that improves the category loading logic.
The actual problem is loading a lot of data into PHP memory; this particular issue has not been fixed since v2. However, compared to v2 there are a lot of other changes that improve page load time and reduce memory usage.
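To make the memory point concrete, here is a minimal sketch of the general idea of loading only the categories a page needs instead of the whole table. It is not the patch mentioned above and not b2evo code; it is written in Python with SQLite purely for illustration, and the categories table, its contents and both function names are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical categories table of 40 000 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE categories (cat_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO categories VALUES (?, ?)",
    [(i, f"cat-{i}") for i in range(1, 40001)],
)


def load_all_categories():
    """Eager approach: pull every category into memory on each request."""
    return dict(conn.execute("SELECT cat_id, name FROM categories"))


def load_categories_for(cat_ids):
    """Selective approach: fetch only the categories the current page uses."""
    ids = sorted(set(cat_ids))  # drop duplicates before building the query
    if not ids:
        return {}
    placeholders = ",".join("?" for _ in ids)
    query = f"SELECT cat_id, name FROM categories WHERE cat_id IN ({placeholders})"
    return dict(conn.execute(query, ids))
```

With a page that shows a handful of posts, the selective version keeps a few rows in memory per request, while the eager version holds all 40 000 regardless of what the page displays.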
Yes, you will need to apply your changes to v5 after the upgrade. Let me know if you need help upgrading or tuning the system afterwards.
Just a note: WordPress and Joomla aren't optimized for large datasets either; they are even slower than b2evolution.