Publications

2012
  • Sofus A. Macskassy (2012). Characterizing Retweeting Behaviors in Twitter: On the use of Text vs. Concepts. Workshop on Collective Learning and Inference on Structured Data (CoLISD), at ECML/PKDD 2012. [pdf]
  • Sofus A. Macskassy (2012). Mining Dynamic Networks: The Importance of Pre-processing on Downstream Analytics. The Second International Workshop on Mining Communities and People Recommenders (COMMPER), at ECML/PKDD 2012. [pdf]
  • Sofus A. Macskassy (2012). On the Study of Social Interactions in Twitter. Proceedings of the Sixth International Conference on Weblogs and Social Media (ICWSM), 2012. [pdf]
2011
  • Sofus A. Macskassy (2011). Relational Classifiers in a Non-relational world: Using Homophily to Create Relations. The Tenth International Conference on Machine Learning and Applications, 2011. [pdf]
  • Steve Minton, Matthew Michelson, Kane See, Sofus A. Macskassy, Bora C. Gazen, and Lise Getoor (2011). Improving Classifier Performance by Autonomously Collecting Background Knowledge from the Web. The Tenth International Conference on Machine Learning and Applications, 2011. [pdf]
  • Sofus A. Macskassy (2011). Contextual Linking Behavior of Bloggers: Leveraging text-mining to enable topic-based analysis. In Social Network Analysis and Mining, Volume 1, Number 4, 355-375. The official published paper is available online at http://www.springerlink.com/openurl.asp?genre=article&id=doi:10.1007/s13278-011-0026-8. DOI:10.1007/s13278-011-0026-8. [pdf]
  • Steve Minton, Sofus A. Macskassy, Peter LaMonica, Kane See, Craig A. Knoblock, Greg Barish, Matthew Michelson and Raymond Liuzzi (2011). Monitoring Entities in an Uncertain World: Entity Resolution and Referential Integrity. In the Twenty-Third Annual Conference on Innovative Applications of Artificial Intelligence (IAAI), 2011. [pdf]
  • Sofus A. Macskassy and Matthew Michelson (2011). Why do People Retweet? Anti-Homophily Wins the Day!. In the Fifth International Conference on Weblogs and Social Media (ICWSM), 2011. [pdf]
  • Matthew Michelson and Sofus A. Macskassy (2011). What Blogs Tell Us about Websites: A Demographic Study. In the Proceedings of the Fourth ACM International Conference on Web Search and Data Mining (WSDM), Hong Kong, 2011. [pdf]
  • Matthew Michelson, Sofus A. Macskassy, Steve Minton and Lise Getoor (2011). Materializing Multi-Relational Databases from the Web using Taxonomic Queries. In the Proceedings of the Fourth ACM International Conference on Web Search and Data Mining (WSDM), Hong Kong, 2011. [pdf]
2010
  • Matthew Michelson and Sofus A. Macskassy (2010). Discovering Users' Topics of Interest on Twitter: A First Look. Proceedings of the Workshop on Analytics for Noisy, Unstructured Text Data (AND), Toronto, Canada, 2010. Toolkit (EntityExplorer) available here. [pdf]
  • Sofus A. Macskassy and Matthew Michelson (2010). Linking in Social Media Does Not a Community Make. Proceedings of the Workshop on Information in Networks (WIN-2010). [pdf]
  • Sofus A. Macskassy (2010). Leveraging contextual information to explore posting and linking behaviors of bloggers. Proceedings of the 2010 International Conference on Advances in Social Networks Analysis and Mining (ASONAM-2010). [pdf]
  • Matthew Michelson, Sofus A. Macskassy and Steve Minton (2010). Mixed-Initiative, Entity-Centric Data Aggregation using Assistopedia. Proceedings of the AAAI Workshop on Collaboratively-built Knowledge Sources and Artificial Intelligence (WikiAI), Atlanta, GA, 2010. [pdf]
  • Matthew Michelson and Sofus A. Macskassy (2010). An Efficient Sequential Covering Algorithm for Explaining Subsets of Data. Proceedings of the 2010 International Conference on Artificial Intelligence (ICAI). [pdf]
2009
  • Sofus A. Macskassy (2009). The many faces of guilt-by-association. Proceedings of the Workshop on Information in Networks (WIN-2009). [pdf]
  • Sofus A. Macskassy (2009). Using Graph-based Metrics with Empirical Risk Minimization to Speed Up Active Learning on Networked Data. Proceedings of the 15th ACM SIGKDD Conference On Knowledge Discovery and Data Mining, 2009. [pdf]
  • Matthew Michelson and Sofus A. Macskassy (2009). Layered, Multivariate Anomaly Explanations: A First Look. Proceedings of the International Workshop on Statistical Relational Learning, (SRL-2009). [pdf]
  • Shefali Sharma and Sofus A. Macskassy (2009). Ranking Techniques for Cluster Based Search Results in a Textual Knowledge-base. Proceedings of the 2009 International Conference on Artificial Intelligence (ICAI). [pdf]
  • Matthew Michelson and Sofus A. Macskassy (2009). Judging the Performance of Cascading Models: A First Look. Proceedings of the Fourth workshop on evaluation methods in machine learning (2009). [pdf]
  • Matthew Michelson and Sofus A. Macskassy (2009). Record Linkage Measures in an Entity Centric World. Proceedings of the Fourth workshop on evaluation methods in machine learning (2009). [pdf]
  • Matthew Michelson, Sofus A. Macskassy and Steven N. Minton (2009). Flexible query formulation for federated search. Proceedings of the Seventh International Workshop on Information Integration on the Web (IIWeb 2009). [pdf]
2008
  • Sofus A. Macskassy and Claude C. Nanjo (2008). Graph Mining using Graph Pattern Profiles. Proceedings of the 2008 International Conference on Artificial Intelligence (ICAI). [pdf]
  • Sofus A. Macskassy and Evan S. Gamble (2008). Data Mining in the Context of Entity Resolution. Workshop on Data Mining for Business Applications at the 14th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD). [pdf]
  • Paul Tetlock, Maytal Saar-Tsechansky, and Sofus A. Macskassy (2008). More Than Words: Quantifying Language to Measure Firms' Fundamentals. Journal of Finance, 63(3), pages 1437-1467, June 2008. [pdf]
2007
  • Sofus A. Macskassy (2007). Improving Learning in Networked Data by Combining Explicit and Mined Links. Proceedings of the Twenty-Second Conference on Artificial Intelligence (AAAI-2007), July 22-26, 2007, Vancouver, Canada. [ps] [pdf]
  • Sofus A. Macskassy (2007). Improving Within-Network Classification with Local Attributes. Workshop on Text-Mining and Link Analysis (Textlink) at the Twentieth International Joint Conference on Artificial Intelligence, January 7, 2007, Hyderabad, India. [ps] [pdf]
  • Evan S. Gamble, Sofus A. Macskassy, Steve Minton (2007). Classification with Pedigree and its Applicability to Record Linkage. Workshop on Text-Mining and Link Analysis (Textlink) at the Twentieth International Joint Conference on Artificial Intelligence, January 7, 2007, Hyderabad, India. [ps] [pdf]
  • Sofus A. Macskassy, Foster Provost (2007). Classification in Networked Data: A toolkit and a univariate case study. Journal of Machine Learning Research, 8(May):935-983, 2007. (This is the journal version of the CeDER-04-08 technical report below.)
    Data files used in this paper (formatted for the latest version of NetKit): NetKit-Data.zip (1.3Mb).
    [pdf]
2006
  • Sofus A. Macskassy, and Foster Provost (2006). A brief survey of machine learning methods for classification in networked data and an application to suspicion scoring. E.M. Airoldi et al. (Eds.): ICML 2006 Ws, LNCS 4503, pp. 172-175. Springer-Verlag. [pdf].
    Originally appeared as a poster at the Workshop on Statistical Network Learning at 23rd International Conference on Machine Learning (ICML 2006), Pittsburgh, 29 June, 2006.
    [pdf]
2005
  • Sofus A. Macskassy, Foster Provost, and Saharon Rosset (2005). ROC Confidence Bands: An Empirical Evaluation. In Proceedings of the 22nd International Conference on Machine Learning (ICML 2005), Bonn, Germany, 7-11 August, 2005. (this also will appear in Proceedings of the Second workshop on ROC Analysis in ML, at ICML-2005). [ps] [pdf]
  • Sofus A. Macskassy, Foster Provost, and Saharon Rosset (2005). Pointwise ROC Confidence Bounds: An Empirical Evaluation. In Proceedings of the Second workshop on ROC Analysis in ML, at the 22nd International Conference on Machine Learning. [ps] [pdf]
  • Sofus A. Macskassy, Foster Provost (2005). Suspicion scoring based on guilt-by-association, collective inference, and focused data access. In Proceedings of the NAACSOS Conference 2005. June 2005. (This is a follow-up to the similarly titled International Conference on Intelligence Analysis paper below.) [ps] [pdf]
  • Sofus A. Macskassy, Foster Provost (2005). NetKit-SRL: A Toolkit for Network Learning and Inference. In Proceedings of the NAACSOS Conference 2005. June 2005. [ps] [pdf]
  • Sofus A. Macskassy, Foster Provost (2005). Suspicion scoring based on guilt-by-association, collective inference, and focused data access. In Proceedings of the International Conference on Intelligence Analysis. May 2005. [ps] [pdf]
2004
  • Sofus A. Macskassy (2004). Significance Testing against the Random Model for Scoring Models on Top k Predictions. CeDER Working Paper #CeDER-05-09, Stern School of Business, New York University, NY, NY 10012. December 2004. [ps] [pdf]
  • Sofus A. Macskassy, Foster Provost (2004). Classification in Networked Data: A toolkit and a univariate case study. CeDER Working Paper #CeDER-04-08, Stern School of Business, New York University, NY, NY 10012. December 2004. Updated December 2006. This is the technical report version of the JMLR paper above. [ps] [pdf]
  • Sofus A. Macskassy, Foster Provost (2004). Confidence Bands for ROC Curves: Methods and an Empirical Study. In Proceedings of the First Workshop on ROC Analysis in AI (ROCAI-2004) at ECAI-2004. August 2004. [ps] [pdf]
  • Sofus A. Macskassy, Foster Provost (2004). Simple Models and Classification in Networked Data. CeDER Working Paper 03-04, Stern School of Business, New York University, NY, NY 10012. 2004. [ps] [pdf]
2003
  • Sofus A. Macskassy, Haym Hirsh (2003). Adding Numbers to Text Classification. In Proceedings of the Twelfth International Conference on Information and Knowledge Management (CIKM 2003). [ps] [pdf]
  • Foster Provost, Claudia Perlich, and Sofus A. Macskassy (2003). Relational Learning Problems and Simple Models. In Proceedings of the IJCAI-2003 Workshop on Learning Statistical Models from Relational Data.
  • Sofus A. Macskassy, Foster Provost (2003). A Simple Relational Classifier. In the 2nd Workshop on Multi-Relational Data Mining (MRDM-2003) at KDD-2003. [ps] [pdf]
  • Claudia Perlich, Foster Provost, and Sofus A. Macskassy (2003). Predicting citation rates for physics papers: Constructing features for an ordered probit model. In SIGKDD Explorations 5(2), 2003, 154-155.
  • Sofus A. Macskassy, Haym Hirsh, Foster Provost (2003). Intelligent Information Filtering: Learning Prospective User Profiles. Invited talk at the Joint Statistical Meeting topic contributed session on "Know Your Customer: User Profiling for CRM and Intrusion/Fraud Detection". [Compressed powerpoint slides].
  • Sofus A. Macskassy, Foster Provost, Michael L. Littman (2003). Confidence Bands for ROC Curves. CeDER Working Paper IS-03-04, Stern School of Business, New York University, NY, NY 10012. [ps] [pdf]
  • Sofus A. Macskassy (2003). New Techniques in Information Filtering. Ph.D. dissertation. Department of Computer Science, Rutgers University, New Brunswick, NJ. 2003. [ps] [pdf]
  • Sofus A. Macskassy, Haym Hirsh, Arunava Banerjee and Aynur A. Dayanik (2003). Converting Numerical Classification into Text Classification. Artificial Intelligence, 143(1):51-77, January 2003. [ps] [pdf]
2001
  • Sofus A. Macskassy, Haym Hirsh, Foster Provost, Ramesh Sankaranarayanan and Vasant Dhar (2001). Intelligent Information Triage. © ACM, 2001. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in 24th Annual International Conference on Research and Development in Information Retrieval (SIGIR-2001), pages 318-326. (http://doi.acm.org/10.1145/383952.384015) [ps] [pdf]
  • Sofus A. Macskassy, Haym Hirsh, Arunava Banerjee and Aynur A. Dayanik (2001). Using Text Classifiers for Numerical Classification. Seventeenth International Joint Conference on Artificial Intelligence (IJCAI-2001). [ps] [pdf]
  • Sofus A. Macskassy, Haym Hirsh, Foster Provost, Ramesh Sankaranarayanan and Vasant Dhar (2001). Information Triage using Prospective Criteria. 8th International Conference on User Modeling (UM-2001) Workshop on Machine Learning, Information Retrieval and User Modeling. [zipped PowerPoint slides from the talk]. [ps] [pdf]
  • Sofus A. Macskassy (2001). Intelligent Information Triage: Learning to Prioritize by Integrating Multiple Sources of Information. A Student Scholarship Poster in the Eighteenth International Conference on Machine Learning (ICML-2001). [zipped PowerPoint postscript slides].
2000
  • Sofus A. Macskassy, Aynur A. Dayanik and Haym Hirsh (2000). Information Valets for Intelligent Information Access. AAAI Spring Symposia Series on Adaptive User Interfaces, (AUI-2000). [ps] [pdf]
1999
  • Sofus A. Macskassy, Aynur A. Dayanik and Haym Hirsh (1999). EmailValet: Learning User Preferences for Wireless Email. IJCAI-99 workshops: Learning About Users and Machine Learning for Information Filtering, 1999.
    Slides for the talk are available in PowerPoint as ijcai1999-slides.ppt (162Kb, requires a viewer that can display EPS files) and as compressed PostScript ijcai1999-slides.ps.gz (85Kb).
    [ps] [pdf]
  • Sofus A. Macskassy, Aynur A. Dayanik and Haym Hirsh (1999). EmailValet: Learning Email Preferences for Wireless Platforms. Seventh International Conference on User Modeling workshop Machine Learning for User Modeling, (UM-1999). [ps] [pdf]
1998
  • Sofus A. Macskassy, Arunava Banerjee, Brian D. Davison and Haym Hirsh (1998). Human Performance on Clustering Web Pages: A Preliminary Study. Poster at The Fourth International Conference on Knowledge Discovery and Data Mining, (KDD-1998).
    (A longer version is available as a technical report DCS-TR-355.)
    [ps] [pdf]
  • Sofus A. Macskassy, Arunava Banerjee, Brian D. Davison and Haym Hirsh (1998). Human Performance on Clustering Web Pages. Technical Report, DCS-TR-355, Department of Computer Science, Rutgers University, August 1998.
    (A shorter version appeared as a poster in The Fourth International Conference on Knowledge Discovery and Data Mining.)
    [ps] [pdf]
1997
  • Sofus A. Macskassy and Leon Shklar (1997). Maintaining information resources. Proceedings of the Third International Workshop on Next Generation Information Technologies (NGITS'97), June 30-July 3, 1997, Neve Ilan, Israel. [ps] [pdf]
Unpublished Manuscripts:
  • Sofus A. Macskassy, Aynur A. Dayanik and Haym Hirsh (2000). EmailValet: Where do you want to read your Email?. [ps] [pdf]
  • Aynur A. Dayanik, Sofus A. Macskassy and Haym Hirsh (2000). Binning: Converting Numerical Classification into Text Classification. [ps] [pdf]
  • Sofus A. Macskassy, Aynur A. Dayanik and Haym Hirsh (2000). Information Valets: Adaptivity for Multi-Platform Access to Heterogeneous Information. [ps] [pdf]
  • Sofus A. Macskassy (1998). A Comparison of Two On-line Algorithms that Adapt to Concept Drift. [ps] [pdf]
  • Sofus A. Macskassy and Leon Shklar (1998). Maintaining Information Resources: Experimental Studies.
  • Sofus A. Macskassy (1996). A Conversational Agent. My master's essay from Spring 1996, approved by Dr. Suzanne Stevenson. [ps] [pdf]