python - looping a function in a data frame


My project: I'm creating Elo ratings for tennis players, and I have 2 different dataframes.

(1) a dataframe of player ratings (2) a dataframe of matches, ordered chronologically

Working through the dataframe of matches, I retrieve the ratings of both players and apply 2 functions (I already have them defined): predicted_result(rating1, rating2) and updated_rating(rating1, rating2). The first one gives me the expected result of the match given the ratings; the second one gives me the updated ratings. I then need to record the updated ratings in the player dataframe.

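I think I'm looking for a loop that goes line by line and:

  • on the first line of the match dataframe, retrieves both players' ratings from
    the player dataframe
  • runs both functions
  • replaces the old ratings with the updated ratings in the player dataframe, then moves on to the next line.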
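The question does not show the two functions, so the following is only a minimal sketch of what they might look like, assuming the standard Elo logistic formula. The names predicted_result and updated_rating come from the question; the bodies, the K-factor k=32, and the choice to return both new ratings as a tuple are assumptions made for illustration.

```python
def predicted_result(rating1, rating2):
    """Expected score of player 1 against player 2 (standard Elo curve)."""
    return 1.0 / (1.0 + 10 ** ((rating2 - rating1) / 400.0))


def updated_rating(winner_rating, loser_rating, k=32):
    """Return (new_winner_rating, new_loser_rating) after one match.

    k is an assumed K-factor; the question does not say which value is used.
    """
    expected_win = predicted_result(winner_rating, loser_rating)
    # The winner scored 1, so the gain shrinks as the win becomes more expected.
    delta = k * (1.0 - expected_win)
    return winner_rating + delta, loser_rating - delta
```

With this form the points gained by the winner equal the points lost by the loser, so the total of all ratings stays constant from match to match.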

Match dataframe

        winner     loser
    0   nadal      federer
    1   djokovic   verdasco
    2   nadal      djokovic
    3   del potro  verdasco

Player dataframe

        player     rating
    0   nadal      2320
    1   djokovic   2280
    2   verdasco   2120
    3   federer    1890
    4   del potro  1542

I found the answer below, which shows how to roll a formula down a dataframe, but I'm missing how to save the updated ratings in the player dataframe:

Rolling function on data frame

The principal issue here seems to be the unhelpful format of the ratings dataframe. Since the purpose of an index is to make it easy to access rows by index value, the problem becomes easier if you make the player name the index. Since I don't know how the ratings are calculated, I have assumed that winning increases a rating by one point and losing reduces it by one.

First, let me make sure I'm using the same data :)

    In [154]: ratings
    Out[154]:
          player  rating
    0      nadal    2320
    1   djokovic    2280
    2   verdasco    2120
    3    federer    1890
    4  del potro    1542

    In [155]: results
    Out[155]:
          winner     loser
    0      nadal   federer
    1   djokovic  verdasco
    2      nadal  djokovic
    3  del potro  verdasco

Next, make a copy of the ratings table with the player name as the index.

    In [156]: ir = ratings.set_index(ratings["player"].values)

I chose to remove the original "player" column, since it is now redundant. YMMV.

    In [157]: del ir["player"]

    In [158]: ir
    Out[158]:
               rating
    nadal        2320
    djokovic     2280
    verdasco     2120
    federer      1890
    del potro    1542

You can iterate over a column of the results table:

    In [159]: for row in results["winner"]:
       .....:     print(row)
       .....:
    nadal
    djokovic
    nadal
    del potro

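So it's relatively easy to update the ratings:

    In [160]: for row in results["winner"]:
       .....:     # .loc writes into ir directly; chained indexing such as
       .....:     # ir['rating'][row] may update a temporary copy instead
       .....:     ir.loc[row, "rating"] += 1
       .....:

    In [161]: for row in results["loser"]:
       .....:     ir.loc[row, "rating"] -= 1
       .....:

    In [162]: ir
    Out[162]:
               rating
    nadal        2322
    djokovic     2280
    verdasco     2118
    federer      1889
    del potro    1543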
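With a real Elo update, each match has to be processed as a whole, because the new ratings depend on both players' current ratings at that point in time. The sketch below walks the results table one row at a time and writes both new ratings back before moving on. The helper apply_matches and its update argument are hypothetical names introduced here, and the demo plugs in the same placeholder rule as above (plus one for a win, minus one for a loss) rather than a real Elo formula.

```python
import pandas as pd

ratings = pd.DataFrame({
    "player": ["nadal", "djokovic", "verdasco", "federer", "del potro"],
    "rating": [2320, 2280, 2120, 1890, 1542],
})
results = pd.DataFrame({
    "winner": ["nadal", "djokovic", "nadal", "del potro"],
    "loser": ["federer", "verdasco", "djokovic", "verdasco"],
})


def apply_matches(ratings, results, update):
    """Return player-indexed ratings after processing `results` in row order.

    `update(winner_rating, loser_rating)` must return the two new ratings.
    """
    ir = ratings.set_index("player")            # new frame; `ratings` is untouched
    ir["rating"] = ir["rating"].astype(float)   # real Elo updates are fractional
    for match in results.itertuples(index=False):
        new_w, new_l = update(ir.at[match.winner, "rating"],
                              ir.at[match.loser, "rating"])
        # Write both ratings back before the next (later) match is read.
        ir.at[match.winner, "rating"] = new_w
        ir.at[match.loser, "rating"] = new_l
    return ir


# Placeholder rule from the answer above: +1 for a win, -1 for a loss.
final = apply_matches(ratings, results, lambda w, l: (w + 1, l - 1))
print(final)
```

Swapping the lambda for a function that wraps your own updated_rating is the only change needed, provided it returns the winner's and the loser's new ratings in that order.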
