Cake day: June 25th, 2023

  • For one thing: don’t bother with fancy log destinations. Just log to stderr and let your daemon manager take care of directing that where it needs to go. (systemd made life a lot easier in the Linux world).

    Structured logging is overrated since it means you can’t just do the above.

    Per-module (filterable) logging is quite useful, but it must be automatic (use __FILE__, __name__, or whatever your language supports) or you will never actually do it. All semi-reasonable languages support some form of either macros-which-capture-the-current-module-and-location or peek-at-the-caller-module-name-and-location.
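
    A minimal Python sketch of both points (the `myapp.*` module names are made up for illustration): log to stderr, name loggers automatically with `__name__`, and filter per module without touching the modules themselves.

```python
import logging
import sys

# Log to stderr and let the daemon manager (e.g. systemd) redirect it.
logging.basicConfig(stream=sys.stderr, level=logging.INFO,
                    format="%(levelname)s %(name)s: %(message)s")

# One logger per module, named automatically via __name__ --
# no manual bookkeeping, so it actually gets done.
log = logging.getLogger(__name__)

# Later, filter per module centrally ("myapp.network" is hypothetical):
logging.getLogger("myapp.network").setLevel(logging.DEBUG)
logging.getLogger("myapp").setLevel(logging.WARNING)
```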


    One subtle part of logging: never conditionally defer a computation that can fail. Many logging APIs ultimately support something like:

    if (log_level >= INFO) // or <= depending on how levels are numbered
        do_log(INFO, message, arguments...)
    

    This is potentially dangerous - if logging of that level is disabled, the code is never tested, and trying to enable logging later might introduce an error when evaluating the arguments or formatting them into the message. Also, if logging of that level is disabled, side-effects might not happen.

    To avoid this, do one of:

    • never use the if-style deferring, internally or externally. Instead, squelch the I/O only. This can have a significant performance cost (especially at the DEBUG level), which is why the if-style API exists in the first place.
    • ensure that your type system can statically verify that runtime errors are impossible in the conditional block. This requires that you are using a sane language and logging library.
    • run your testsuite at every log level, ensure 100% coverage of log code, and hope that the inevitable logic bug doesn’t have an unexpected dynamic failure.
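
    A minimal Python sketch of the pitfall, using the standard `logging` module (the function name is illustrative): with an explicit `isEnabledFor` guard, the argument expression is never even evaluated while the level is disabled, so any error or side effect inside it stays hidden until someone turns the level on.

```python
import logging

calls = {"expensive": 0}

def expensive_summary():
    # Imagine this raises on some inputs -- you won't find out
    # until DEBUG logging is actually enabled.
    calls["expensive"] += 1
    return "summary"

logger = logging.getLogger("demo")
logger.setLevel(logging.INFO)  # DEBUG is disabled

if logger.isEnabledFor(logging.DEBUG):
    logger.debug("state: %s", expensive_summary())

# The guard skipped the call entirely: nothing inside
# expensive_summary() ran, buggy or not.
assert calls["expensive"] == 0
```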


  • Let’s introduce terms here: primarily, we’re plotting “combat power” as a function of “progress level”. Both of these are explained below.

    I assume we’re speaking about a level system that scales indefinitely. If there is a very small level cap, it’s not important that all this math actually be done in full (though it doesn’t hurt); balancing the constants is more important in that case.

    The main choice to be made is whether this function is polynomial or exponential; the second choice is the exponent or base, respectively (in either case, call it N). Note that subexponential (but superpolynomial) functions exist, but are hard to directly reason about; one purpose of the “progress level” abstraction is to make that easier.

    Often there are some irregularities at level 1, but if we ignore those:

    • in a polynomial system, if N level-1 characters can fight equally to 1 level-2 character, then N level-10 characters can fight equally to 1 level-20 character.
    • in an exponential system, if N level-1 characters can fight equally to 1 level-2 character, then N level-19 characters can fight equally to 1 level-20 character.
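
    Both bullets can be checked numerically. Here power = level**k for the polynomial case and power = base**level for the exponential case; k = 2 and base = 3 are arbitrary example constants.

```python
def poly_power(level, k=2):
    # polynomial combat power: level**k
    return level ** k

def expo_power(level, base=3):
    # exponential combat power: base**level
    return base ** level

# Polynomial: N level-1s matching one level-2 means N = 2**k ...
N_poly = poly_power(2) // poly_power(1)   # 4 when k = 2
# ... and the same N carries the 10-vs-20 match:
assert N_poly * poly_power(10) == poly_power(20)

# Exponential: N level-1s matching one level-2 means N = base ...
N_expo = expo_power(2) // expo_power(1)   # 3 when base = 3
# ... but here N only ever bridges a single level, 19 vs 20:
assert N_expo * expo_power(19) == expo_power(20)
```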

    The third choice is whether to use absolute scale or relative scale. The former satisfies the human need for “number go up”, but if your system is exponential it implies a low level cap and/or a low base, unless you give up all sanity and touch floats (please don’t). The latter involves saying things like “attacks on somebody one level higher are only half as effective; on someone one level lower it’s twice as effective”, just be sure to saturate on overflow (or always work at the larger level’s scale, or declare auto-fail or auto-pass with a sufficient level difference, or …).

    Note that relative scale is often similar to how XP works even if everything else uses absolute scale, but mostly I’m not talking about XP here (though it is relevant for “is it expected for people actually to hit the hard level cap, or are you relying on a soft cap?”).
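
    A sketch of relative scale with saturation, assuming the "half as effective per level" rule quoted above (the clamp bound of 10 is an arbitrary choice):

```python
def relative_effectiveness(attacker_level, defender_level, max_diff=10):
    # Attacks on someone one level higher are half as effective,
    # one level lower twice as effective: 2**(level difference).
    # Clamp the difference so the multiplier saturates instead of
    # overflowing or collapsing to a meaningless extreme.
    diff = attacker_level - defender_level
    diff = max(-max_diff, min(max_diff, diff))
    return 2.0 ** diff
```

    Equivalently, you could declare auto-fail/auto-pass beyond `max_diff` instead of saturating.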


    Progress level is purely a utility abstraction. If you have exponential tiers (e.g. if you make a big deal about displayed levels 10, 100, 1000 or 4, 16, 64, 256) you might choose to set it to the log of the displayed level (this often matches human intuition, which is bad at math), though not necessarily, since the extra logarithm might needlessly complicate math that would otherwise cancel.

    If you want xianxia-style “punching up is harder the higher you go” (and you actually mean it), progress level might be some superlinear function of character level (quadratic? exponential?).

    Otherwise it is just the displayed character level. (In some contexts it is useful to treat this as a decimal, including the fraction of XP earned toward the next level; perhaps not a straight fraction if you want to account for the increasing XP requirements, though for planning math it usually isn’t that important.)


    Combat power is usually a product of up to 6 components, in rough order of importance:

    • effective health (including shields and damage resistance, if applicable - the latter is often hyperbolic!)
    • attack damage (including math for crits - though if they’re a constant max multiple you can ignore them)
    • attack chance (as modified by enemy dodge). Often this is bounded (e.g. min 5%, max 95%), so you get weirdos trying to use peasants to kill a dragon.
    • number of targets that can (practically) be hit with a single attack. Usually this involves compromises so might be ignored.
    • distance at which the attack can be performed. Usually this involves compromises so might be ignored.
    • how long you can keep fighting effectively without a rest (regeneration is relevant for this, whether innate, from items, or from spells). This is only relevant for certain kinds of games.

    So the net result is usually somewhere between a quadratic and a cubic system relative to the scaling of the individual components. If the individual scaling is exponential it’s common to just ignore the polynomial here though.

    Things like “stats” and “skills” are only relevant insomuch as they apply to the individual components. One other thing to think about is “how effective is a buff applied by a high-level character to a low-level character, and vice versa”, which is similar to the motivation of level requirements for gear.
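
    The components above can be sketched as a product (all names and the tuning constant `k` are illustrative; the hyperbolic resistance formula is one common choice, not the only one):

```python
def effective_health(hp, resistance, k=100):
    # Hyperbolic damage resistance: each point of resistance adds a
    # constant fraction of base hp, so it never reaches immunity.
    return hp * (1 + resistance / k)

def combat_power(hp, resistance, damage, hit_chance,
                 targets=1, range_factor=1.0, endurance=1.0):
    # Rough product of the six components; the last three default to 1
    # since they usually involve compromises and are often ignored.
    hit_chance = max(0.05, min(0.95, hit_chance))  # bounded, as noted above
    return (effective_health(hp, resistance) * damage * hit_chance
            * targets * range_factor * endurance)
```

    If each factor scales polynomially with progress level, the product is where the "between quadratic and cubic" net scaling comes from.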


  • All of these can be done with raw strings just fine.

    For the first pathlib bug case, PATH-like lookup is common, not just for binaries but also data and conf files. If users explicitly request ./foo they will be very upset if your program instead looks at /defaultpath/foo. Also, God forbid you dare pass a Path("./--help") to some program. If you’re using os.path.dirname this works just fine.

    For the second pathlib bug case, dir/ is often written so that you’ll cause explicit errors if there’s a file by that name. Also there are programs like rsync where the trailing slash outright changes the meaning of the command. Again, os.path APIs give you the correct result.

    For the article mistake, backslash is a perfectly legal character in non-Windows filenames and should not be treated as a directory component separator. Thankfully, pathlib doesn’t make this mistake at least. OTOH, / is reasonable to treat as a directory component separator on Windows (and some native APIs already handle it, though normalization is always a problem).

    I also just found that the pathlib.Path constructor ignores extra kwargs. But Python has never bothered much with safety anyway, and this minor compared to the outright bugs the other issues cause.
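
    The first two bug cases are easy to demonstrate: pathlib silently normalizes away both the leading `./` and the trailing slash, while raw strings (and `os.path`) preserve them.

```python
import os.path
from pathlib import PurePosixPath

# Case 1: the explicit "./" that opts out of PATH-like lookup is dropped.
assert str(PurePosixPath("./--help")) == "--help"   # now looks like a flag!
assert os.path.dirname("./--help") == "."           # os.path keeps it

# Case 2: the trailing slash that demands a directory is dropped too.
assert str(PurePosixPath("dir/")) == "dir"
assert os.path.basename("dir/") == ""               # os.path still sees it
```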





  • I’ve done something similar. In my case it was a startup script that did something like the following:

    • poll github using the search API for PR labels (note that this has sometimes stopped returning correct results, but …).
      • always do this once at startup
      • you might do this based on notifications; I didn’t bother since I didn’t need rapid responsiveness. Note that you should not do this for the specific data from a notification though; it’s only a way to wake up the script.
      • but no matter what, you should do this after N minutes, since notifications can be lost.
    • perform a git fetch for your main development branch (the one you perform the real merges to) and all pull/ refs (git does not do this by default; you’ll have to set them up for your local test repo. Note that you want to refer to the unmerged commits for these)
    • if the set of commits for all tagged PRs has not changed, wait and poll again
    • reset the test repo to the most recent commit from your main development branch
    • iterate over all PRs with the appropriate label:
      • ordering notes:
        • if there are commits that have previously tested successfully, you might do them first. But still test again since the merge order could be different. This of course depends on the level of tests you’re doing.
        • if you have PRs that depend on other PRs, do them in an appropriate order (perhaps the following will suffice, or maybe you’ll have some way of detecting this). As a rule we soft-forbid this though; such PRs should have been merged early.
        • finally, ordering by PR number is probably better than ordering by last commit date
      • attempt the merge (or rebase). If it’s a no-op, log that somewhere. If it’s not clean, skip the PR for now (and log that), but only mark this as an error if it was the first PR you’ve merged (since otherwise the conflict could be a prior PR’s fault).
      • Run pre-build stuff that might need to create further commits, build the product, and run some quick tests. If they fail, rollback the repo to the previous merge and complain.
      • Mark the commit as apparently good. Note that this is specifically applying to commits not PRs or branch names; I admit I’ve been sloppy above.
    • perform a pre-build, build and quick test again (since we may have rolled back and have a dirty build - in fact, we might not have ended up merging anything!)
    • if you have expensive tests, run them only here (and treat this as “unexpected early exit” below). It’s presumed that separate parts of your codebase aren’t too crazily entangled, so if a particular test fails it should be “obvious” which PR is relevant. Keep in mind that I used this system for assumed viable-work-in-progress PRs.
    • kill any existing instance and launch a new instance of the product using the build from the final merged commit and begin accepting real traffic from devs and beta users.
    • users connecting to the instance should see the log
    • if the launched instance exits unexpectedly within M minutes AND we actually ended up merging anything into the known-good branch, then reset to the main development branch (and build, etc.) so that people at least have a functioning test server, but complain loudly in the MOTD when they connect to it. The condition means that if it exits suddenly again, the whole script starts over from the top, which may be necessary if someone intentionally killed the server to force a new merge sequence but did so too soon.
      • alternatively you could try bisecting the set of PR commits or something, but I never bothered. Note that you probably can’t use git bisect for this, since you explicitly do not want to try a commit from the middle of a PR. It might be simpler to whitelist or blacklist one commit at a time, but if you’re failing here, remember that all tests are unreliable.
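
    The ordering rules above can be sketched as a pure helper; all of the git/build plumbing is elided, and the data shapes are made up for illustration:

```python
def merge_order(prs, previously_good):
    """Order PRs for one pass of the merge-attempt loop.

    prs: iterable of (pr_number, head_commit) pairs (hypothetical shape).
    previously_good: set of commits that passed a prior run.

    Previously-good commits go first -- but they still get re-tested,
    since the merge order may have changed. Within each group, order by
    PR number rather than by last-commit date.
    """
    return sorted(prs, key=lambda pr: (pr[1] not in previously_good, pr[0]))
```

    The driver around this would fetch, merge each entry in order, roll back on failed quick tests, and record the surviving head commits as the next run's `previously_good` set.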