I began thinking about review scores earlier this week, leading me down a variety of little thoughtful paths, all of which I’m going to talk about here in what can only be described as an uncoordinated rant that covers several subjects, some of which are even related! Sort of. This isn’t an article on whether or not reviews should have scores, though, because quite honestly I very much enjoy using a scoring system when writing reviews, even if I also understand why others believe it can be a hindrance. To me a score is a helpful summary of what I’ve written that should never take precedence over the actual text. My goal when writing is to justify the final score, to make people understand why the numerical value I’ve assigned to the game makes sense. A score should add to a review, not take away from it.
But I digress, something which I’ll probably end up doing a fair bit. The reason review scores came into my frame of thought is a little strange. It came through an accusation from someone who had read my review of the S.T.R.I.K.E. 5 keyboard and claimed that I had written it simply to appease Mad Catz, saying that clearly any sensible person would buy a far cheaper keyboard that has the same benefits, and that therefore my review was utter bollocks. Like any such accusation, it cut deep, because the very reason I own this site and write about games is that I love doing it, and because I wanted to try to provide the most honest views I could. Distrust of big websites is at an all-time high, so it’s up to the smaller sites to try to provide the most thoughtful reviews they can. I was also baffled, because in the review I did mention that a sensible person should indeed go out and buy a cheaper keyboard with all the practical benefits the S.T.R.I.K.E. 5 had, because you can pick one of those up for considerably less. Yet I still gave the keyboard a 4 out of 5, because I felt it wasn’t a product for sensible people: it had all the practical benefits a gamer wants, but it also had a load of pretty impractical features for geeks like me who just want cool stuff. And anyway, for me the price of something doesn’t affect my final verdict and score, because after all what people view as good value for money differs greatly.
Still, the accusation worried me. Maybe I hadn’t got my point across well enough. More importantly, this was the third Mad Catz product in the space of about two months, two of which I had given a rating of four out of five and the third a three. When I considered this I could see from his or her perspective how it might have looked. Of course, from my perspective the reason I had a sudden influx of products from Mad Catz was simple: the start of 2013 was the first time I had managed to get in contact with the company in a long while. I had previously written up a review for their Ghost Recon headset in early 2012, as well as the controller, but presumably they didn’t like my style at the time, and I didn’t hear anything from them until I recently contacted them again and was shocked when they sent out the F.R.E.Q. 7 headset. I wrote the review for that, published it and then sent the PR rep the review link, as that’s the polite thing to do. He was nice, complimented my work and even thanked me for making a few points in the review that they hadn’t thought of, like having adjustable volume for each ear cup, meaning he had actually bothered to read what I had written, which is a rare thing, I find. Because he was happy with the quality of the review he was willing to send out more products. As such I’ve currently got a good relationship with the company.
You see, there are usually two types of publisher or company out there. The first kind relies purely on numbers, by which I mean they won’t even talk to you if your site doesn’t have a minimum amount of traffic passing through it. The quality of the review doesn’t matter; the numbers do. Capcom are an example of this, which explains why I only have reviews up for Capcom games I’ve personally bought. The second type goes by quality of review (though good traffic still helps), by which I mean that if the PR rep feels your writing is of a good standard then they’re willing to work with you more in the future.
From my perspective the second type is of course easier to work with, because I haven’t got the traffic to attract those that go purely by the numbers. I’m not claiming my writing is of very high quality, but it’s good enough that many publishers are willing to work with me, a fact of which I’m pretty damn proud. The problem is that from the outside it’s so easy for it to look like the companies aren’t working with you because of quality, but because you’re handing out big scores. A cynical part of me says maybe that is the case at times. Having said that, I’ve been blacklisted a few times as well, by publishers who weren’t chuffed when I sent them a link to a review with a low score, and I wouldn’t take those scores back for any amount of money or free gear. The developers of Amy weren’t very happy with my review, for example, and its publisher refused to talk to me for quite a while. In my defense, the game was crap. Really crap. Like, painfully crap. Not even funny crap, just crap. God, I hate that game.
So, I thought about how it must have looked to my accuser: three positive reviews of one company’s products in quick succession. I get where the person is coming from, then, but ultimately I chose to make a polite reply and leave it there, because these kinds of things are to be expected and I’ve got to learn to deal with them. At the end of the day, I know I gave those products those scores because they felt great to use, and I feel like I managed to make that point well enough. I know I was honest about my thoughts. I’ve still got a long way to go before I could consider myself even a good writer, though. Maybe I didn’t get my points across as well as I thought I did, or could have done if my meagre wordsmithing talents were even a tiny bit better.
Anyway, the point is that this did get me thinking about the scores themselves, namely that I seem to be handing out a lot of fours lately. At the start of 2013 I switched my scoring system from the out of ten I had been using to a simpler out of five. Since that switch the majority of scores I’ve put out seem to be fours, which is odd, because naturally you’d assume it would be a three, which equals a good game, or in other words the general average quality we see within the industry these days. The reason for these fours is pretty simple, I reckon: the average quality of products and games is pretty high. It’s not often I play a game I could actually consider bad any more, though obviously they still exist, lurking in the shadows and waiting for us to lower our guard and open our wallets. In fact most of them are… great. Four out of five.
So does that mean I need to readjust my scoring to compensate for all of these fours running around the place? Should I now start handing out threes to games I’d usually consider a four? Many gaming sites claim that review scores are an ever-moving target, and that’s true to a degree, but it makes things a bit confusing for the reader: exactly how much has the average bar been raised this time? What now warrants a three out of five, when it used to warrant a four? This is one of many reasons why there’s so much arguing over IGN and Gamespot review scores being inconsistent. You might review a game and hand it the highest possible score, but if you reviewed it six months later it might score considerably lower, because by then other games are doing things just as well, thereby raising expectations. Sequels are also a great example: you see many people leaving abusive comments because a sequel is described as better than the original, and yet the score is either a bit lower than the original’s or only a little better. And that’s simply because while the sequel is considerably better than the original, games in general have moved on, and in comparison to those being released around it the sequel may not hold up. Scores reflect the quality of games as of the moment of writing, and thus a massively improved sequel can score the same as or even lower than its predecessor. It’s up to the author to make this clear in the review.
Anyway, because of this I’ve tried to keep my system as stable as possible. Obviously, though, I’ve had to adjust it a few times, because if you didn’t do that and kept the same quality bar then you’d end up with some confusing reviews. For example, something like Gears of War was amazing when it first came out, but if you reviewed it today it would do far less well, and for good reason: there are loads of games that do the cover system and action just as well these days. If you’d kept your same quality standards, though, it would get the same score as it would have back in 2006, despite the industry having moved forward a good bit since then, a score which would misrepresent the game as something outstanding when it no longer is. So scores have to change, yet I believe they should also remain as stable as possible, because a continuously moving bar essentially renders a numerical value moot, as nobody is sure what it means any more without very clear context being given.
But by trying to keep scores steady, the end result is that the average on this site now comes across as a four, and that’s not right, clearly. Or is it? I considered dropping my scores when I swapped to the new system. Last year I even made the conscious decision to try to be a bit harsher in my reviews, because I felt I was being too lenient; I’m the type of gamer that can find a degree of fun in even the worst titles (Amy notwithstanding) and thus struggle to give very low scores. But then I came to the conclusion that such a choice wouldn’t do justice to a lot of the games I’ve reviewed. Imagine if I dropped the scores by a point to adjust: something like Tomb Raider would have a score of three, equalling a good game, even though I firmly believe it’s a great game, hence the current score of four. BioShock Infinite would be a four, equalling great, even though I firmly believe it’s absolutely outstanding.
If you hadn’t already figured it out, then, I’m at something of an impasse on how I feel about scores. From every angle I approach the argument there are good, solid points on every side. Adjusting the scale would make sense in reflecting the average quality, but then the average quality really is “great” these days, so why change it? I’d have to change the descriptive words I use as well, and the very way I think about a game, so that both text and score would match. But then, aren’t I always changing the way I think about games as previously fresh mechanics become commonplace, and games which once did a certain thing very well find themselves amidst a sea of other titles now doing that same thing just as well, or sometimes even better? And that made me realise how subconscious the way I think about games, scoring and average quality must be: I’ve been adjusting it all along without really realising it, and you’ve all been doing it too, without ever thinking about it. We must have been, otherwise we’d end up with some odd scores around the place. If I reviewed the original Gears of War now I’d probably score it a three, when back in the day I would likely have handed it a four and a half or even a five because it was innovative and fresh. Ergo, I have changed the way I judge quality, without thinking about it.
So at this point I’m leaning toward not reworking my system and thought patterns. They’re doing a good job by themselves, clearly. And anyway, we’re near the next generation, so a score shift seems pointless at the moment.
Still, a deliberate change is coming in the form of the next generation. Next-gen games demand a rethink of the scoring system because they should by default be better than our current games, or that’s the theory, at least. As such, if we didn’t adjust our expectations the majority of next-gen titles would be getting max scores, and that’d be a bit barmy. This comes with a myriad of problems, though, and they’re problems I’d like to see some of the big sites talk about so that people could get a clearer idea of how their review process works, a look at how the bar is raised or lowered.
The first problem is figuring out what the average quality of games for the new platforms actually is, and to do that you’ve got to play a considerable number of titles. You can’t just play three next-gen games and declare you know exactly what the average quality is, and therefore what the average score will be. And that’s probably going to be a struggle, because traditionally there aren’t many games on offer for new consoles. Then there’s another factor to take into account: the quality bar will have to be raised again not long after the launch of the next-gen consoles, because early next-gen titles won’t be a very good indicator of what developers can actually do. Initial launch line-ups will be composed of games created solely for the next-gen machines by developers still getting to grips with the technology, and therefore not capable of producing their best work, and of games ported over from current-gen consoles, which obviously aren’t an indication of real quality either. As a result, as the months go on the quality bar will probably fluctuate wildly while developers get to grips with the tech, which is obviously going to be a pain for people reading reviews as they try to figure out just where that metaphorical bar is sitting today. And who knows how long that fluctuation will last? The benefit of having such a long console cycle this time around was that the quality bar sorted itself out quite a while ago, leaving us to just move it incrementally as time went by and the pace at which developers improved slowed.
Over time review scores will once again sort themselves out. But in the meantime I wish massive sites like IGN and Gamespot would start being more transparent with readers about how scoring works, about how quality standards change, and even about why scores differ between genres and thus should never be compared; that way there might be fewer daft Internet arguments about how one game scored an eight while another got a six. For example, you see people complaining on reviews about how an FPS got a 9 while a third-person action game got a 7, arguing that the action game is so much better than the FPS and therefore should have scored higher. The thing is, the action game is judged against standards set within its own genre, and likewise the FPS is judged and scored based upon its own field of games. The score any given game gets is only truly relevant and comparable to other scores within its own genre, though there are of course some exceptions to that rule.
So, was there actually a genuine point to this baffling rant? Mostly I just wanted to explain why I seem to be handing out a lot of great scores lately: because the average quality of games is pretty high these days. And also because I wanted to briefly chat about the ever-moving review score bar and how it’s both unconsciously and consciously adjusted by regular gamers and critics alike. Along the way we somehow even detoured into publisher country for very little reason, though in some strange way I hope it gives you an indication of the strange relationship between publishers and the rest of the world, and perhaps some part of me wanted to justify the entire small wave of Mad Catz product reviews, because even though I have to accept it’s going to happen, being called dishonest still stings.
To cut this story a touch shorter, though, there’s no real point I’m making here, except perhaps that you should think about how review scores work the next time you’re considering making some horrible comment on YouTube or IGN or something. Review scores are an ever-changing, fluid concept, one made even more confusing by the involvement of personal opinion, upon which every review is based. It’s the job of a reviewer, then, to adequately explain and justify their opinion and the numerical value they’ve assigned to it, while always accounting for that bloody moving bar which indicates the average quality of games at the time of writing.
And finally, if I’m handing out more fours than normal, should I indeed be considering adjusting my scores? At this point, no, because I feel it’s fair to show that the average quality of our games is currently pretty damn high, and that we are all better off because of it. Once the next-gen hits, though, all bets are off.