# Difference between revisions of "Forum:MMG - Long-term profit warning"

m (→Discussion) |
Chessmaster (talk | contribs) m (Clarification on anyone being able to add the confidence intervals) |

## Latest revision as of 11:27, 30 September 2021

Hello. While looking up some data on GP/hr in the moneymaking guides, something caught my eye: there is no clear "warning" for long-term profit moneymaking guides. Allow me to illustrate what I mean with an example: the Killing Soulgazers guide lists a profit of (at the time of writing) ~9.2m gp/hr. However, this profit is only true under the assumption that a player sticks around to kill the average of 7165 soulgazers mentioned in the guide, which takes give or take 36 hours (rounded up). After all, you cannot physically get 0.026th of a hexhunter bow for killing one hour's worth of soulgazers.

Therefore I propose two solutions to clearly inform players of this for any MMGs it applies to:

- Create a template similar to `{{MMGWarning}}` to inform players most profit comes from rare drops that take a significant amount of time/kills to obtain
- Divide the MMG into two clearly-identified sections:
  - Hourly profit without the rares
  - Hourly profit with the rares, informing players how long it takes to obtain the profit mentioned

**EDIT**: Please check out the proposal by Chessmaster below. **Dalek Sec** 08:15, 25 September 2021 (UTC)

## Discussion

**Support 1 initially with a move to 2 at convenience** - I think solution 1 could easily be implemented, with a soft roll-out to updating the calculators for solution 2. Yurple (talk) 18:08, 14 September 2021 (UTC)

**Comment** - I agree and support the principle of the proposal. However, unless I'm misunderstanding your example, the misleading factor seems to be high Variance rather than time taken. What I mean by that is that spending 36 hours (or 50, 100 etc.) does not guarantee 9.2m gp/hr and in fact unless you spend massive amounts of time there, your real gp/hr will likely be significantly lower or higher based on luck. So I'd say we need to present that somehow. I wouldn't call it *variance* as that's likely too technical a term, but maybe something like *Randomness* or *Luck factor* for each MMG entry, with hover text explaining the rates may vary significantly based on luck?

If we went with that, we could then give it a rating (say, this example could be a 3 out of 3). This doesn't need to be a subjective rating either, variance is straightforward to calculate if we know the approximate drop rates. bad_fetus^{talk} 20:44, 14 September 2021 (UTC)
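To put a rough number on the luck point above, here is a minimal sketch. It assumes the hexhunter bow is a flat 1/7165 roll per kill (reading the guide's "average of 7165" as the drop rate, which is an assumption) and about 199 kills per hour (7165 kills spread over the 36 hours mentioned). The function name is made up for illustration.

```python
def chance_of_no_drop(kills: int, drop_rate: float) -> float:
    """Probability of seeing zero drops in `kills` independent rolls."""
    return (1.0 - drop_rate) ** kills


DROP_RATE = 1 / 7165        # assumption: "average of 7165 kills" read as the drop rate
KILLS_PER_HOUR = 7165 / 36  # assumption: derived from the 36-hour figure in the thread

for hours in (10, 36, 100):
    kills = round(hours * KILLS_PER_HOUR)
    print(f"{hours:>3} h: {chance_of_no_drop(kills, DROP_RATE):.1%} chance of no bow yet")
```

Under those assumptions, roughly 37% of players would still have no bow after the full 36 hours, which is why the listed average says little about any one player's result.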

**Neutral** - I agree with the sentiment. As pointed out above, a distinction must be made between methods that take a significant amount of time and those that have high variance. Ideally we should provide enough information to the reader so that they can make informed choices based on it. The problem here is that the more you think about it, the more complicated it becomes, and the things that initially seem too technical actually are the simplest way to express these concepts. I.e. to accurately present the information, we should start talking about variance and such. I wouldn't be happy seeing a 2 out of 3 luck rating for a money-making method, as that would just lead me to question what it precisely means in practice, and it would lead to similar ambiguity to that of most drop tables currently. Separately listing "guaranteed" and "luck-based" averages also has many problems (such as drawing the line on what actually is guaranteed). As for the first proposition, that can already be covered with the existing template; the risk here is the sunk time with a chance to never get the big drops required for profitability, and no new templates are needed. I wouldn't oppose if someone comes up with a concrete solution on how to present the information accurately. Thingummywut (talk) 03:20, 20 September 2021 (UTC)

- Very well said. The only thing that really comes to my mind is displaying the average time it would take to acquire the drop(s) to get to the listed GP/h. This way you don't have to know where the lines lie, because it's tied to a dynamic number rather than two categories (falling in the "guaranteed" or "luck-based"). Of course something like this doesn't come without its own problems. For example, what happens when multiple rare drops are involved with different drop rates and prices? Is it ambiguous to have something like this when you can see how rare something is by just looking at the drop rate? But on the other hand, the time itself could serve as the warning, or rough estimate of time required. Just a thought! **Ley**^{Talk} 22:32, 23 September 2021 (UTC)
  - Let's consider a simple example: A monster has a guaranteed drop worth 100 coins, a 1/100 drop worth 1M coins and a 1/1,000 drop worth 10 coins. The average drop is worth 10,100.01 coins. In this scenario we'd say something like "It takes on average 1,000 kills to get all the drops contributing to this average", but it's very misleading as the vast majority is contributed by the 1M drop. Add more drops with more complicated drop mechanics to the mix and such statements get even more difficult to interpret. The average here already accurately represents the expected value of a kill. Things like masterwork smithing that have practically no variance are very different. As stated above, it's actually the variance that gives the information required, and interpreting variance values gets a bit technical for the average reader. It feels to me that the problem is fundamentally complicated, and variance is the simplest accurate solution. Maybe the idea of having a "2 out of 3 luck rating" isn't so bad if it's just specified somewhere exactly what kind of variance values this means. Thingummywut (talk) 02:46, 24 September 2021 (UTC)
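The three-drop example above can be checked with a few lines. This is only a sketch: it treats each drop as an independent roll per kill (an assumption), and the names `DROPS` and `per_kill_stats` are made up for illustration.

```python
from math import sqrt

# (probability per kill, value in coins), taken from the example above
DROPS = [(1.0, 100), (1 / 100, 1_000_000), (1 / 1000, 10)]


def per_kill_stats(drops):
    """Return (mean, standard deviation) of one kill's value.

    Assumes every drop is rolled independently of the others.
    """
    mean = sum(p * v for p, v in drops)
    variance = sum(p * (1 - p) * v * v for p, v in drops)
    return mean, sqrt(variance)


mean, sd = per_kill_stats(DROPS)
print(f"mean per kill: {mean:,.2f}")
print(f"standard deviation per kill: {sd:,.0f}")
for p, v in DROPS:
    print(f"  the {v:,} coin drop contributes {p * v / mean:.4%} of the mean")
```

The standard deviation per kill comes out at nearly ten times the mean, and almost all of the mean comes from the 1M drop, which is the point being made about the "1,000 kills" statement.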

### PROPOSAL 3

I've thought a bit more about what I said above, and it occurred to me we can calculate confidence intervals once we have the variance by assuming a normal distribution, ultimately letting us display percentiles of outcomes. This should be **a lot more readable** to the average user. So I hacked together some Java code and got some results. Note that just like the average profit we currently list, these values depend on the player's speed at the activity. Additionally, the results depend on how much time the player spends on it. With all that said, here are the example cases:

Soulgazers - High variance

Percentile | 10% | 30% | 50% | 70% | 90% |
---|---|---|---|---|---|
10 hours | -39,430,380 | -39,430,380 | 93,523,886 | 305,302,330 | 416,002,676 |
30 hours | -118,291,140 | -86,239,365 | 280,571,659 | 647,382,684 | 839,121,307 |
100 hours | -84,528,606 | 265,536,623 | 935,238,865 | 1,604,941,108 | 1,955,006,337 |
300 hours | 1,039,427,524 | 1,645,758,286 | 2,805,716,597 | 3,965,674,907 | 4,572,005,670 |

Rorarii - Low variance

Percentile | 10% | 30% | 50% | 70% | 90% |
---|---|---|---|---|---|
10 hours | 24,157,675 | 26,047,203 | 29,662,018 | 33,276,833 | 35,166,361 |
30 hours | 79,452,253 | 82,725,012 | 88,986,055 | 95,247,099 | 98,519,857 |
100 hours | 279,213,924 | 285,189,136 | 296,620,185 | 308,051,234 | 314,026,446 |
300 hours | 859,712,029 | 870,061,399 | 889,860,557 | 909,659,715 | 920,009,086 |

Primal starter - No variance (not random)

Percentile | 10% | 30% | 50% | 70% | 90% |
---|---|---|---|---|---|
10 hours | 152,496,000 | 152,496,000 | 152,496,000 | 152,496,000 | 152,496,000 |
30 hours | 457,488,000 | 457,488,000 | 457,488,000 | 457,488,000 | 457,488,000 |
100 hours | 1,524,960,000 | 1,524,960,000 | 1,524,960,000 | 1,524,960,000 | 1,524,960,000 |
300 hours | 4,574,880,000 | 4,574,880,000 | 4,574,880,000 | 4,574,880,000 | 4,574,880,000 |

A few notes:

- The tables above show profit rather than just raw rewards, hence the negative values if you are unlucky with soulgazers.
- The 50% value will always show the average (which we already have) since we assume a normal distribution.
- The methodology is more reliable for lengthier hours spent, as it relies on the Central Limit Theorem linked above. For example, the soulgazers example above is misleading at the 50% value on the 10 hours row - you have a less than 50% chance of turning profit at that point. All other cells in the table are reasonable, however. Also, the other 2 tables have no such issues as they aren't as extreme.
- Activities that consistently result in multiple notable rewards are edge cases that make it a bit trickier to calculate variance, as I don't think we have data on probabilities of things dropping simultaneously. I'm confident these can be reasonably approximated however.
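For anyone wanting to see the general shape of the calculation, here is a minimal sketch of a normal-approximation percentile row. It is not the Java code mentioned above and is not meant to reproduce the tables: the function name and the input figures are hypothetical, and it assumes the hourly mean and standard deviation are already known and that hours are independent, so the standard deviation grows with the square root of the hours spent.

```python
from math import sqrt
from statistics import NormalDist


def profit_percentiles(mean_per_hour, sd_per_hour, hours,
                       percentiles=(10, 30, 50, 70, 90)):
    """Normal-approximation total profit at each percentile after `hours`."""
    total_mean = mean_per_hour * hours
    total_sd = sd_per_hour * sqrt(hours)
    if total_sd == 0:
        # Non-random method: every percentile is simply the mean.
        return {p: total_mean for p in percentiles}
    total = NormalDist(mu=total_mean, sigma=total_sd)
    return {p: total.inv_cdf(p / 100) for p in percentiles}


# Hypothetical inputs, not taken from any guide.
for hours in (10, 30, 100, 300):
    row = profit_percentiles(1_000_000, 2_000_000, hours)
    print(hours, {p: f"{value:,.0f}" for p, value in row.items()})
```

The zero-variance branch corresponds to the "not random" case, where every column shows the same value. The 50% column equals the plain average in every row because the normal distribution is symmetric.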

**With all the nerdy math out of the way, I'd like to propose the following:**

- Add the *isRandom* field to the template.
- If the above field is set to true, display a table similar to the one below, under a title along the lines of *"Profit probability distribution"*. Note that this table is only 1 row unlike the above data, for easier readability. (Example case is soulgazers).

Percentiles | 10% | 30% | 50% | 70% | 90% |
---|---|---|---|---|---|
30 hours | -118,291,140 | -86,239,365 | 280,571,659 | 647,382,684 | 839,121,307 |

- Add a custom input field similar to the *kills per hour* that can be found here. The new field will allow the user to change the number of hours they plan to spend on the activity, allowing them to see a personalised distribution. *Ideally*, the initial value of the field will be based on the variance. (e.g. 10 for rorarii, but something like 30 for soulgazers, as we want representative numbers, per #3 above). Might also want to show a warning saying the values are less accurate beyond this point if the user reduces the hours too much.

Note that everything is calculated from pre-existing data already in the template, so this should not be adding any load beyond implementing the math logic and marking things as random or not. As I already have the logic implemented in Java, I can help implement the changes to the module if this passes (but realistically I'd need help, as I've never touched the module stuff on the wiki before).

**Amendment** - To handle low sample size cases as Cook pointed out above, I've written down an improved algorithm here. Note that this only affects how the logic is implemented and the overall proposal to show confidence intervals has not changed. bad_fetus^{talk} 20:35, 27 September 2021 (UTC)

**Support 3** - as nominator. bad_fetus^{talk} 03:33, 25 September 2021 (UTC)

- This is partially a good idea but I think it would be a mistake to assume the normal distribution. CLT won't really start to be the dominant factor for some of these until many hundreds of hours. It would probably be better to compute the CDF explicitly, on-demand for various killcount, and put it on a graph. **ʞ***o***o***ɔ* 10:01, 25 September 2021 (UTC)
  - Valid point about CLT being a bad approximation for very small expected numbers of a drop. Stats is hard for me... is treating each drop in the table as independent a valid approximation, and then just sum up the quantiles*prices? Wte81 (talk) 19:35, 27 September 2021 (UTC)
    - Treating each drop as independent is an *excellent* approximation. However, there is no easy way to combine the data from there. Summing up the percentiles does not work. As an example, you can consider rolling 3 d10s. The 10% value for each individual d10 is 1. If you sum those up, you'll get 3; but 3 is the 0.1% percentile value for 3d10 as it's (10%)^{3}.
    - For Cook's point, I hadn't originally considered that low sample sizes would be relevant as I only looked at slayer monsters. It does take a lot of hours to get valid results for things like endgame bosses as you said. I have now added a different way to handle those cases in the above amendment. It should now be able to handle any sample size without issue. bad_fetus^{talk} 20:35, 27 September 2021 (UTC)
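The 3d10 example can be verified by brute force. The sketch below enumerates every roll to build the exact distribution of the total, then compares the sum of the per-die 10% values against the 10% value of the total itself. The function names are made up for illustration, and this is not the algorithm from the amendment.

```python
from collections import Counter
from itertools import product


def sum_distribution(sides: int, dice: int) -> Counter:
    """Count how many of the sides**dice equally likely rolls give each total."""
    return Counter(sum(roll) for roll in product(range(1, sides + 1), repeat=dice))


def percentile(counts: Counter, percent: int) -> int:
    """Smallest total whose cumulative probability reaches `percent`."""
    total = sum(counts.values())
    running = 0
    for value in sorted(counts):
        running += counts[value]
        if running * 100 >= percent * total:  # integer comparison, no rounding
            return value
    raise ValueError("percent must be between 0 and 100")


one_die = sum_distribution(10, 1)
three_dice = sum_distribution(10, 3)

print(percentile(one_die, 10) * 3)               # 3: sum of the per-die 10% values
print(percentile(three_dice, 10))                # 10: 10% value of the real total
print(three_dice[3] / sum(three_dice.values()))  # 0.001: chance of rolling exactly 3
```

The same enumeration idea is what makes an explicit CDF possible for independent drops, although real drop tables would need something smarter than listing every combination.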

**Support 3** - I'll be the first to admit I suck at math, so I cannot verify all the math above. However, assuming it's correct, it actually makes sense to me. It's a clean solution that tells the users a lot without confusing them too much. **Dalek Sec** 08:09, 25 September 2021 (UTC)

**Support** Cowcow (talk) 19:14, 26 September 2021 (UTC)

**Support** - As long as things are kept simple enough for editors and readers. Thingummywut (talk) 22:31, 26 September 2021 (UTC)

**Comment** - Cook was thinking about presenting the data in a graph format rather than a table, so I generated some CDF graphs to see how that might look: Rorarii 10 hours, Nex:AOD 10 hours, Soulgazers 10 hours. I feel a lot of users will be confused by these graphs rather than helped, but let me know if you have an opinion on it. We could also have both the table and the graph if people think there's merit in both. bad_fetus^{talk} 22:59, 27 September 2021 (UTC)

**Support 1 & 3** - I like the idea of having the table showing the variance. It is very useful information. However, I would personally still like to see a warning (or rather "notice" I suppose since warning sounds harsh) that the specific method involves a degree of variance, and may not be suitable for short term profit. Example messagebox:

⚞Myles Prower⚟ 14:36, 29 September 2021 (UTC)

**Comment for Proposal 3** - I like having a statistical base for variance, and having charts and graphs on the individual pages can help communicate that variance. However, for a casual user browsing Money making guide, I'd like to understand how we would succinctly present the relevant information so that the user can find a moneymaking guide that suits their needs, given this additional information.
Aescopalus ^{talk} 14:42, 29 September 2021 (UTC)

**Support 1, small support 3** - The template Myles put together looks good. As for the percentile tables, although they may be a good addition, I'm not sure how many people would be able to add these to the guides. There should also be some sort of column added to the guide table on Money making guide to show at a glance that the guide may not be the best for a quick profit. **BlackHawk**^{ (Talk)} 06:58, 30 September 2021 (UTC)

- Anyone will be able to "add them to the guides", as that will simply involve adding a parameter along the lines of ***isRandom =** true*, per the first point of the proposal. The rest will be handled by scripts. bad_fetus^{talk} 11:27, 30 September 2021 (UTC)