Missing tournaments

Since 1958, six Grand Sumo tournaments (honbasho) have been held each year (source).

Have there been any exceptions?

We’ll continue working with the banzuke dataset.

Read banzuke.csv with hard-coded column types:

library(tidyverse)
df <- read_csv(
    "banzuke.csv",
    col_types = "ciccccDddcii"
)
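
For readers unfamiliar with the compact col_types string: each character gives the type of one column, in order (c = character, i = integer, d = double, D = date). Here is a minimal sketch on a made-up three-column CSV. The column names and values are illustrative only and are not taken from banzuke.csv, and passing literal data with I() assumes readr 2.0 or later.

```r
library(readr)

# Toy CSV passed as literal data via I(); names and values are made up.
toy <- read_csv(
    I("basho,rank_no,day\n1983.01,1,1983-01-09\n1983.03,2,1983-03-13\n"),
    col_types = "ciD"  # c = character, i = integer, D = date
)

# Column classes follow the order of the letters in the string.
sapply(toy, class)
```

Declaring basho as character keeps labels such as "1983.01" as text; left to type guessing, readr would likely parse them as numbers.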

This dataset contains all tournaments from 1983 through 2020:

df %>% 
    pull(basho) %>% 
    unique()
##   [1] "1983.01" "1983.03" "1983.05" "1983.07" "1983.09" "1983.11" "1984.01"
##   [8] "1984.03" "1984.05" "1984.07" "1984.09" "1984.11" "1985.01" "1985.03"
##  [15] "1985.05" "1985.07" "1985.09" "1985.11" "1986.01" "1986.03" "1986.05"
##  [22] "1986.07" "1986.09" "1986.11" "1987.01" "1987.03" "1987.05" "1987.07"
##  [29] "1987.09" "1987.11" "1988.01" "1988.03" "1988.05" "1988.07" "1988.09"
##  [36] "1988.11" "1989.01" "1989.03" "1989.05" "1989.07" "1989.09" "1989.11"
##  [43] "1990.01" "1990.03" "1990.05" "1990.07" "1990.09" "1990.11" "1991.01"
##  [50] "1991.03" "1991.05" "1991.07" "1991.09" "1991.11" "1992.01" "1992.03"
##  [57] "1992.05" "1992.07" "1992.09" "1992.11" "1993.01" "1993.03" "1993.05"
##  [64] "1993.07" "1993.09" "1993.11" "1994.01" "1994.03" "1994.05" "1994.07"
##  [71] "1994.09" "1994.11" "1995.01" "1995.03" "1995.05" "1995.07" "1995.09"
##  [78] "1995.11" "1996.01" "1996.03" "1996.05" "1996.07" "1996.09" "1996.11"
##  [85] "1997.01" "1997.03" "1997.05" "1997.07" "1997.09" "1997.11" "1998.01"
##  [92] "1998.03" "1998.05" "1998.07" "1998.09" "1998.11" "1999.01" "1999.03"
##  [99] "1999.05" "1999.07" "1999.09" "1999.11" "2000.01" "2000.03" "2000.05"
## [106] "2000.07" "2000.09" "2000.11" "2001.01" "2001.03" "2001.05" "2001.07"
## [113] "2001.09" "2001.11" "2002.01" "2002.03" "2002.05" "2002.07" "2002.09"
## [120] "2002.11" "2003.01" "2003.03" "2003.05" "2003.07" "2003.09" "2003.11"
## [127] "2004.01" "2004.03" "2004.05" "2004.07" "2004.09" "2004.11" "2005.01"
## [134] "2005.03" "2005.05" "2005.07" "2005.09" "2005.11" "2006.01" "2006.03"
## [141] "2006.05" "2006.07" "2006.09" "2006.11" "2007.01" "2007.03" "2007.05"
## [148] "2007.07" "2007.09" "2007.11" "2008.01" "2008.03" "2008.05" "2008.07"
## [155] "2008.09" "2008.11" "2009.01" "2009.03" "2009.05" "2009.07" "2009.09"
## [162] "2009.11" "2010.01" "2010.03" "2010.05" "2010.07" "2010.09" "2010.11"
## [169] "2011.01" "2011.05" "2011.07" "2011.09" "2011.11" "2012.01" "2012.03"
## [176] "2012.05" "2012.07" "2012.09" "2012.11" "2013.01" "2013.03" "2013.05"
## [183] "2013.07" "2013.09" "2013.11" "2014.01" "2014.03" "2014.05" "2014.07"
## [190] "2014.09" "2014.11" "2015.01" "2015.03" "2015.05" "2015.07" "2015.09"
## [197] "2015.11" "2016.01" "2016.03" "2016.05" "2016.07" "2016.09" "2016.11"
## [204] "2017.01" "2017.03" "2017.05" "2017.07" "2017.09" "2017.11" "2018.01"
## [211] "2018.03" "2018.05" "2018.07" "2018.09" "2018.11" "2019.01" "2019.03"
## [218] "2019.05" "2019.07" "2019.09" "2019.11" "2020.01" "2020.03" "2020.07"
## [225] "2020.09" "2020.11"

Can we quickly check if any tournaments are missing?

Here’s one way:

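# Build every expected basho label ("YYYY.MM") for 1983-2020,
# then keep the labels that are absent from the data.
# expand_grid() is used because purrr::cross2() is deprecated since purrr 1.0.0.
expand_grid(
    year = 1983:2020,
    month = sprintf("%02d", 0:5 * 2 + 1)  # "01", "03", ..., "11"
) %>% 
    mutate(
        basho = paste(year, month, sep = ".")
    ) %>% 
    pull(basho) %>% 
    setdiff(
        df$basho
    )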
## [1] "2011.03" "2020.05"

There you have it, the two tournaments that never happened:

  • March 2011 was canceled due to a match-fixing scandal.
  • May 2020 was canceled due to the COVID-19 pandemic.
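
As a cross-check, the same gaps should show up when counting distinct tournaments per year: any year with a count other than six is worth a closer look. The sketch below assumes only that basho is a "YYYY.MM" string, as in the output above. It uses a small stand-in tibble (toy, a hypothetical name) so that it runs on its own.

```r
library(dplyr)
library(tibble)

# Stand-in for df: 2010 is complete, 2011 lacks the March basho.
toy <- tibble(
    basho = c(
        paste0("2010.", c("01", "03", "05", "07", "09", "11")),
        paste0("2011.", c("01", "05", "07", "09", "11"))
    )
)

short_years <- toy %>% 
    distinct(basho) %>% 
    mutate(year = substr(basho, 1, 4)) %>%  # first four characters are the year
    count(year, name = "n_basho") %>% 
    filter(n_basho != 6)

short_years
```

Swapping toy for df should flag 2011 and 2020. Unlike the setdiff() approach, this only tells you which years are short, not which month is missing.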