TY  - JOUR
AU  - Roberts, Gareth O.
AU  - Rosenthal, Jeffrey S.
TI  - Football group draw probabilities and corrections
JF  - CANADIAN JOURNAL OF STATISTICS / REVUE CANADIENNE DE STATISTIQUE
J2  - CAN J STAT
PY  - 2023
SN  - 0319-5724
DO  - 10.1002/cjs.11798
UR  - https://m2.mtmt.hu/api/publication/34314200
ID  - 34314200
AB  - This article considers the challenge of designing football group draw mechanisms, which have a uniform distribution over all valid draw assignments, but are also entertaining, practical and transparent. Although this problem is trivial in completely symmetric problems, it becomes challenging when there are draw constraints that are not exchangeable across each of the competing teams, so that symmetry breaks down. We explain how to simulate the FIFA sequential draw method and compute the nonuniformity of its draws by comparison with a uniform rejection sampler. We then propose several practical methods of achieving the uniform distribution while still using balls and bowls in a way which is suitable for a televised draw. The solutions can also be carried out interactively. The general methodology we provide can readily be transported to different competition draws and is not restricted to football events.
LA  - English
DB  - MTMT
ER  -

TY  - JOUR
AU  - Zhao, Yanyan
AU  - Sun, Lei
TI  - A stable and adaptive polygenic signal detection method based on repeated sample splitting
JF  - CANADIAN JOURNAL OF STATISTICS / REVUE CANADIENNE DE STATISTIQUE
J2  - CAN J STAT
PY  - 2023
PG  - 19
SN  - 0319-5724
DO  - 10.1002/cjs.11768
UR  - https://m2.mtmt.hu/api/publication/33903782
ID  - 33903782
N1  - Export Date: 28 November 2023
AB  - Focusing on polygenic signal detection in high-dimensional genetic association studies of complex traits, we develop a stable and adaptive test for generalized linear models to accommodate different alternatives. To facilitate valid post-selection inference for high-dimensional data, our study here adheres to the original sample-splitting principle but does so repeatedly to increase stability of the inference. We show the asymptotic null distribution of the proposed test for both fixed and diverging numbers of variants. We also show the asymptotic properties of the proposed test under local alternatives, providing insights on why power gain attributed to variable selection and weighting can compensate for efficiency loss due to sample splitting. We support our analytical findings through extensive simulation studies and two applications. The proposed procedure is computationally efficient and has been implemented as the R package DoubleCauchy.
LA  - English
DB  - MTMT
ER  -

TY  - JOUR
AU  - Jiang, Cong
AU  - Wallace, Michael P.
AU  - Thompson, Mary E.
TI  - Dynamic treatment regimes with interference
JF  - CANADIAN JOURNAL OF STATISTICS / REVUE CANADIENNE DE STATISTIQUE
J2  - CAN J STAT
PY  - 2022
PG  - 34
SN  - 0319-5724
DO  - 10.1002/cjs.11702
UR  - https://m2.mtmt.hu/api/publication/33244338
ID  - 33244338
AB  - Precision medicine describes health care where patient-level data are used to inform treatment decisions. Within this framework, dynamic treatment regimes (DTRs) are sequences of decision rules that take individual patient information as input data and then output treatment recommendations. DTR estimation from observational data typically relies on the assumption of no interference: i.e., the outcome of one individual is unaffected by the treatment assignment of others. However, in many social network contexts, such as friendship or family networks, and for many health concerns, such as infectious diseases, this assumption is questionable. We investigate the DTR estimation method of dynamic weighted ordinary least squares (dWOLS), which boasts of easy implementation and the so-called double-robustness property, but relies on the assumption of no interference. We define a network propensity function and build on it to establish an implementation of dWOLS that remains doubly robust under interference associated with network links. The method's properties are demonstrated via simulation and applied to data from the Population Assessment of Tobacco and Health (PATH) study to investigate cigarette dependence within two-person household networks.
LA  - English
DB  - MTMT
ER  -

TY  - JOUR
AU  - Fan, Yifan
AU  - Jiang, Binyan
AU  - Yan, Ting
AU  - Zhang, Yuan
TI  - Asymptotic theory in bipartite graph models with a growing number of parameters
JF  - CANADIAN JOURNAL OF STATISTICS / REVUE CANADIENNE DE STATISTIQUE
J2  - CAN J STAT
PY  - 2022
PG  - 24
SN  - 0319-5724
DO  - 10.1002/cjs.11735
UR  - https://m2.mtmt.hu/api/publication/33244076
ID  - 33244076
AB  - Affiliation networks contain a set of actors and a set of events, where edges denote the affiliation relationships between actors and events. Here, we introduce a class of affiliation network models for modelling the degree heterogeneity, where two sets of degree parameters are used to measure the activeness of actors and the popularity of events, respectively. We develop the moment method to infer these degree parameters. We establish a unified theoretical framework in which the consistency and asymptotic normality of the moment estimator hold as the numbers of actors and events both go to infinity. We apply our results to several popular models with weighted edges, including generalized beta-, Poisson and Rayleigh models. Simulation studies and a realistic example that involves the Poisson model provide concrete evidence that supports our theoretical findings.
LA  - English
DB  - MTMT
ER  -

TY  - JOUR
AU  - Csörgő, Miklós
AU  - Dawson, Donald A.
AU  - Nasri, Bouchra R.
AU  - Remillard, Bruno N.
TI  - A random walk through Canadian contributions on empirical processes and their applications in probability and statistics
JF  - CANADIAN JOURNAL OF STATISTICS / REVUE CANADIENNE DE STATISTIQUE
J2  - CAN J STAT
PY  - 2022
PG  - 27
SN  - 0319-5724
DO  - 10.1002/cjs.11730
UR  - https://m2.mtmt.hu/api/publication/33158499
ID  - 33158499
N1  - Funding Agency and Grant Number: Natural Sciences and Engineering Research Council of Canada; Fonds de recherche du Quebec - Nature et technologies; Fonds de recherche du Quebec - Sante; Mathematics for Public Health Funding text: The authors are grateful to the Guest Editor, Bruce Smith, and two anonymous referees for their comments and suggestions. Partial funding in support of this work was provided by the Natural Sciences and Engineering Research Council of Canada, the Fonds de recherche du Quebec - Nature et technologies, the Fonds de recherche du Quebec - Sante, and Mathematics for Public Health.
AB  - In this article, we present a review of important results and statistical applications obtained or generalized by Canadian pioneers and their collaborators, for empirical processes of independent and identically distributed observations, pseudo-observations, and time series. In particular, we consider weak convergence and strong approximations results, as well as tests for model adequacy such as tests of independence, tests of goodness-of-fit, tests of change point, and tests of serial dependence for time series. We also consider applications of empirical processes of interacting particle systems for the approximation of measure-valued processes.
LA  - English
DB  - MTMT
ER  -

TY  - JOUR
AU  - Ge, Xinyi
AU  - Peng, Yingwei
AU  - Tu, Dongsheng
TI  - A generalized single-index linear threshold model for identifying treatment-sensitive subsets based on multiple covariates and longitudinal measurements
JF  - CANADIAN JOURNAL OF STATISTICS / REVUE CANADIENNE DE STATISTIQUE
J2  - CAN J STAT
VL  - 51
PY  - 2022
IS  - 4
SP  - 1171
EP  - 1189
PG  - 19
SN  - 0319-5724
DO  - 10.1002/cjs.11737
UR  - https://m2.mtmt.hu/api/publication/33389460
ID  - 33389460
N1  - Export Date: 19 January 2024 Correspondence Address: Tu, D.; Departments of Mathematics and Statistics & Public Health Sciences and Canadian Cancer Trials Group, Canada; email: dtu@ctg.queensu.ca
AB  - Identification of a subset of patients who may be sensitive to a specific treatment is an important step towards personalized medicine. We consider the case where the effect of a treatment is assessed by longitudinal measurements, which may be continuous or categorical, such as quality of life scores assessed over the duration of a clinical trial. We assume that multiple baseline covariates, such as age and expression levels of genes, are available, and propose a generalized single-index linear threshold model to identify the treatment-sensitive subset and assess the treatment-by-subset interaction after combining these covariates. Because the model involves an indicator function with unknown parameters, conventional procedures are difficult to apply for inferences of the parameters in the model. We define smoothed generalized estimating equations and propose an inference procedure based on these equations with an efficient spectral algorithm to find their solutions. The proposed procedure is evaluated through simulation studies and an application to the analysis of data from a randomized clinical trial in advanced pancreatic cancer.
LA  - English
DB  - MTMT
ER  -

TY  - JOUR
AU  - Susko, E.
TI  - Complex statistical modelling for phylogenetic inference
JF  - CANADIAN JOURNAL OF STATISTICS / REVUE CANADIENNE DE STATISTIQUE
J2  - CAN J STAT
VL  - 50
PY  - 2022
IS  - 4
SP  - 1339
EP  - 1354
PG  - 16
SN  - 0319-5724
DO  - 10.1002/cjs.11741
UR  - https://m2.mtmt.hu/api/publication/33317621
ID  - 33317621
N1  - Export Date: 12 December 2022 Correspondence Address: Susko, E.; Department of Mathematics and Statistics, Canada; email: edward.susko@gmail.com
LA  - English
DB  - MTMT
ER  -

TY  - JOUR
AU  - Douwes-Schultz, Dirk
AU  - Sun, Shuo
AU  - Schmidt, Alexandra M.
AU  - Moodie, Erica E. M.
TI  - Extended Bayesian endemic-epidemic models to incorporate mobility data into COVID-19 forecasting
JF  - CANADIAN JOURNAL OF STATISTICS / REVUE CANADIENNE DE STATISTIQUE
J2  - CAN J STAT
VL  - 50
PY  - 2022
IS  - 3
SP  - 713
EP  - 733
PG  - 21
SN  - 0319-5724
DO  - 10.1002/cjs.11723
UR  - https://m2.mtmt.hu/api/publication/33177590
ID  - 33177590
AB  - Forecasting the number of daily COVID-19 cases is critical in the short-term planning of hospital and other public resources. One potentially important piece of information for forecasting COVID-19 cases is mobile device location data that measure the amount of time an individual spends at home. Endemic-epidemic (EE) time series models are recently proposed autoregressive models where the current mean case count is modelled as a weighted average of past case counts multiplied by an autoregressive rate, plus an endemic component. We extend EE models to include a distributed-lag model in order to investigate the association between mobility and the number of reported COVID-19 cases; we additionally include a weekly first-order random walk to capture additional temporal variation. Further, we introduce a shifted negative binomial weighting scheme for the past counts that is more flexible than previously proposed weighting schemes. We perform inference under a Bayesian framework to incorporate parameter uncertainty into model forecasts. We illustrate our methods using data from four US counties.
LA  - English
DB  - MTMT
ER  -

TY  - JOUR
AU  - Ebner, Bruno
AU  - Henze, Norbert
AU  - Strieder, David
TI  - Testing normality in any dimension by Fourier methods in a multivariate Stein equation
JF  - CANADIAN JOURNAL OF STATISTICS / REVUE CANADIENNE DE STATISTIQUE
J2  - CAN J STAT
PY  - 2021
PG  - 42
SN  - 0319-5724
DO  - 10.1002/cjs.11670
UR  - https://m2.mtmt.hu/api/publication/32931741
ID  - 32931741
AB  - We study a novel class of affine-invariant and consistent tests for multivariate normality. The tests are based on a characterization of the standard d-variate normal distribution by way of the unique solution of an initial value problem connected to a partial differential equation, which is motivated by a multivariate Stein equation. The test criterion is a suitably weighted L^2-statistic. We derive the limit distribution of the test statistic under the null hypothesis as well as under contiguous and fixed alternatives to normality. A consistent estimator of the limiting variance under fixed alternatives, as well as an asymptotic confidence interval of the distance of an underlying alternative with respect to the multivariate normal law, is derived. In simulation studies, we show that the tests are strong in comparison with prominent competitors and that the empirical coverage rate of the asymptotic confidence interval converges to the nominal level. We present a real data example and also outline topics for further research.
LA  - English
DB  - MTMT
ER  -

TY  - JOUR
AU  - Zhang, Xinlian
AU  - Datta, Gauri S.
AU  - Ma, Ping
AU  - Zhong, Wenxuan
TI  - Bayesian spline smoothing with ambiguous penalties
JF  - CANADIAN JOURNAL OF STATISTICS / REVUE CANADIENNE DE STATISTIQUE
J2  - CAN J STAT
PY  - 2021
PG  - 16
SN  - 0319-5724
DO  - 10.1002/cjs.11655
UR  - https://m2.mtmt.hu/api/publication/32396311
ID  - 32396311
AB  - A popular method for flexible function estimation in nonparametric models is the smoothing spline. When applying the smoothing spline method, the nonparametric function is estimated via penalized least squares, where the penalty imposes a soft constraint on the function to be estimated. The specification of the penalty functional is usually based on a set of assumptions about the function. Choosing a reasonable penalty function is the key to the success of the smoothing spline method. In practice, there may exist multiple sets of widely accepted assumptions, leading to different penalties, which then yield different estimates. We refer to this problem as the problem of ambiguous penalties. Neglecting the underlying ambiguity and proceeding to the model with one of the candidate penalties may produce misleading results. In this article, we adopt a Bayesian perspective and propose a fully Bayesian approach that takes into consideration all the penalties as well as the ambiguity in choosing them. We also propose a sampling algorithm for drawing samples from the posterior distribution. Data analysis based on simulated and real-world examples is used to demonstrate the efficiency of our proposed method.
LA  - English
DB  - MTMT
ER  -