New for this year's annual SCSUG Educational Forum is the Graduate Student Symposium.
Analytics is becoming a major tool for generating maximum value from data and supporting business decisions. The education and training of students in the methodology and software application is critical in filling the demand for such analytical expertise. This technical training must be accompanied by opportunities to enhance the soft skills as well.
Louisiana State University, Oklahoma State University and The University of Alabama are jointly hosting a Graduate Student Symposium in Analytics as a venue for communicating analytical ideas, sharing SAS software techniques, and learning from analytics professionals. The symposium will consist of nine student presenters (three from each university) and will cover a wide variety of analytics topics, including ways in which SAS software is used to analyze data. An industry expert from a sponsoring company will be assigned to each student presenter and will serve as a mentor, providing recommendations and positive commentary after the presentation on how the student’s work contributes to the body of knowledge on methodology and software application.
The Student Symposium will be held on Thursday, October 31, 2013, from 11:00am – 5:00pm in the Private Dining Room (PDR) at the OMNI.
See below for a listing of the scheduled abstracts. For more information, please contact Joni Shreve at jnunner@lsu.edu.
Note that (*) indicates presenter.
Developing countries will play a significant role in the future demand for bandwidth, with the largest growth projected in Asia. The increased usage of broadband and mobile broadband continues to push demand for bandwidth. Companies such as Google are committed to meeting this demand with Google Fiber technology that starts with a connection speed 100 times faster than today’s broadband. A recent study reported that doubling broadband speeds in OECD countries resulted in a 0.3% GDP increase ($126 billion). This presentation uses Base SAS® and SAS® Enterprise Miner to develop predictive models that forecast the impact of broadband on Gross Domestic Product (GDP) in 29 countries for 2013-2017, based upon 2010-2012 GDP data from the International Monetary Fund (IMF) as well as actual and forecasted broadband data from Business Monitor for 2010-2017. Base SAS was used to transform, sort, and merge the raw data, and Enterprise Miner was used to transform the variables and build regression models to analyze the relationship between broadband and GDP. More people connected to the web means increased productivity and greater investment in infrastructure by countries, which should lead to a positive impact on GDP. Projecting global broadband growth and its impact on GDP supports companies’ decisions to invest in broadband.
Approximately 80% of world trade at present travels by sea, involving around 110,000 merchant vessels, 1.25 million seafarers, and almost 6 billion tons of goods every year. However, marine piracy increasingly poses a serious challenge to sea trade. The goal of this research is to analyze pirate attacks using historic data in order to identify and categorize risk-prone areas and to suggest measures to prevent such attacks and reduce the loss of property and lives. Using data obtained from the International Maritime Bureau (IMB), we analyze characteristics related to pirates’ choice of attack, such as the attack location’s distance from the shore or port, the type of vessel targeted, the country whose shoreline is nearest the attack, the intensity of attacks, and piracy trends over time. Exploratory analysis via SAS using graphs, histograms, plots, and correlations, as well as preliminary models using regression and ANOVA, will be presented. We hope our results will provide useful information to naval forces, shipmasters, maritime organizations, and industry leaders.
SAS programmers often rely on SAS Enterprise Guide (EG) for exploring new data sets. Though SAS EG is very powerful because of its graphical interface, it can be very time consuming when the same analyses must be applied to multiple variables. Furthermore, a great deal of output is produced that may not be relevant, and many procedures (PROCs) and options are not available. This presentation will describe a new SAS program, named SimpleStream, which streamlines the exploratory phase of data analysis and is analogous to the Explore feature in SAS Enterprise Miner. SimpleStream’s setup is straightforward: load the program, insert the dataset path where indicated, and run. SimpleStream is coded in SAS 9.3, is completely portable, and works with any SAS dataset. It consists of PROC and DATA steps for exploring the assumptions and data conditions necessary for later inferential analysis, especially linear regression: normality, homoscedasticity, multicollinearity, linearity, and detection of outliers. Additionally, the SKIP macro is provided so that blocks of code can be ignored by SAS, allowing the user to tailor the analysis to specific needs and further streamlining the exploratory process. SimpleStream will be applied to the Boston Housing dataset, where the goal is to predict the median home value of neighborhoods based upon 14 variables including age, crime rate, and size of homes by neighborhood. This presentation will describe the exploratory process and illustrate that only the necessary graphs and statistical results are produced, so that the analyst has the output directly pertinent to the decision-making process.
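Two of the data conditions named above, multicollinearity and outliers, can be screened with very little code. The following is a minimal sketch in Python rather than SAS, and it is not SimpleStream's code: the function names, the 0.9 correlation threshold, the 1.5 × IQR outlier rule, and the sample values are all assumptions made for illustration.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def collinear_pairs(columns, threshold=0.9):
    """Pairs of predictor names whose absolute correlation exceeds threshold.

    The 0.9 default is an assumed screening cutoff, not a value from the abstract.
    """
    names = list(columns)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(pearson_r(columns[a], columns[b])) > threshold
    ]


def iqr_outliers(values, k=1.5):
    """Values lying more than k * IQR below the first or above the third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v < low or v > high]


# Made-up illustrative values; NOT the Boston Housing data.
predictors = {
    "rooms": [4.0, 5.0, 6.0, 7.0, 8.0, 9.0],
    "floor_area": [41.0, 49.0, 62.0, 69.0, 81.0, 90.0],
    "age": [30.0, 5.0, 62.0, 18.0, 44.0, 9.0],
}
flagged_pairs = collinear_pairs(predictors)
flagged_values = iqr_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100])
```

Pairwise correlation is only a rough screen for multicollinearity; variance inflation factors from a regression procedure would be the more thorough check.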
Championships are achieved by excellent performance on the court. Measuring offensive and defensive performance provides the information needed to improve efficiency on the court. In his book, Basketball on Paper, Dean Oliver presents the ‘rules and tools for performance analysis’. In this analysis and presentation, Oliver’s offensive and defensive efficiency variables were used to evaluate and improve the efficiency of The University of Alabama Basketball Team. Base SAS® and SAS® Enterprise Miner were used to perform regression analysis on the performance scores of 347 Division I basketball teams. Currently, Alabama is 139th offensively and 21st defensively. Efficiency was measured by calculating the number of points per 100 possessions allowed defensively or scored offensively. After evaluating offensive efficiency; defensive efficiency; offensive and defensive effective field goal percentage; offensive and defensive turnover percentage; offensive and defensive offensive rebounding percentage; and offensive and defensive free throw rate, the results indicated that Alabama needs to focus on average possession length. One measure of success is increasing average possession length defensively; hence, Alabama needs to force the other team to hold the ball longer. This result may become part of Alabama’s on-court strategy as soon as this year, leading to potentially more efficient offensive and defensive output and our first national championship! ROLL TIDE!
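The efficiency measure described above, points per 100 possessions, can be illustrated with a short Python sketch. The possession estimate and its free-throw weight of 0.44 are a commonly used convention assumed here, not something stated in the abstract, and the box-score numbers are invented for illustration.

```python
def estimate_possessions(fga, oreb, tov, fta, ft_weight=0.44):
    """Estimate possessions from box-score totals.

    ft_weight is an assumed convention; other sources use different weights.
    """
    return fga - oreb + tov + ft_weight * fta


def efficiency_per_100(points, possessions):
    """Points scored (offense) or allowed (defense) per 100 possessions."""
    return 100.0 * points / possessions


def effective_fg_pct(fgm, three_pm, fga):
    """Effective field goal percentage: made threes count 1.5 times."""
    return (fgm + 0.5 * three_pm) / fga


# Hypothetical single-game totals, not Alabama's actual statistics.
possessions = estimate_possessions(fga=60, oreb=10, tov=12, fta=25)
offensive_efficiency = efficiency_per_100(points=73, possessions=possessions)
efg = effective_fg_pct(fgm=25, three_pm=8, fga=60)
```

Defensive efficiency uses the same formula with the opponent's points and possessions, so a lower value is better on that side of the ball.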
Cardiovascular diseases (CVDs) are the number one cause of death globally: more people die annually from CVDs than from any other cause. An estimated 17.3 million people died from CVDs in 2008, representing 30% of all global deaths. Of these deaths, an estimated 7.3 million were due to coronary heart disease and 6.2 million were due to stroke. This mortality can be reduced if physicians suggest preventive steps to patients. Physicians rely on ECG reports to study the functioning of a patient’s heart, but the information from these reports is typically underutilized because no automated system exists to help physicians narrow down the epicenter of the problem. ECG reports are available in a pictorial, paper-based electrocardiography format; these reports are scanned into the system and the copies are converted to digital ECG signals. The important features of the ECG signal, especially the QRST complex and associated intervals, are preserved by obtaining the contour from the paper ECG using a Matlab-based tool [2]. The converted digital data will be analyzed via decision tree to explore whether the octant variables have any effect on the different MI locations and on the outcome of the patient, that is, whether they survived the stroke or not. Such a model can be used in preventive care and to help doctors prioritize decisions.
The choice of academic major is one of the most important decisions facing incoming undergraduate students. From an administrative perspective, the choice is equally important in terms of funding and resource allocation, especially in a climate of budget constraints at public institutions. Historically, information systems (IS) programs have experienced considerable fluctuations in enrollment; therefore, this study will investigate factors that influence a student’s decision to major in ISDS at Louisiana State University. A 35-item survey aimed at measuring perceptions of information technology and the IT profession as a whole was administered to undergraduate students enrolled in an introductory business statistics class. These data, along with student characteristics, will be used to predict whether or not a student will major in ISDS. In particular, this presentation will illustrate the use of SAS macros to calculate statistics necessary for two important steps in the data mining process: variable reduction and predictive modeling. Though factor analysis and principal component analysis are widely used for variable reduction, this presentation will illustrate the Weight of Evidence (WOE) and Information Value (IV) approach, which has been used in the credit industry and is gaining popularity in other industries. Once the candidate predictors have been selected, this presentation will describe the results of the logistic regression analysis for model selection and the use of the K-S test for testing model strength and validation. Recommendations for student recruitment will be provided.
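For readers unfamiliar with the WOE and IV approach mentioned above, the following Python sketch shows one common formulation: for each level of a predictor, WOE is the log of the ratio between that level's share of events and its share of non-events, and IV sums the share difference times WOE over all levels. This is not the authors' SAS macro; the sign convention, the function name, and the survey responses are assumptions for illustration only.

```python
import math
from collections import Counter


def woe_iv(levels, targets):
    """Return ({level: WOE}, IV) for one categorical predictor.

    levels  : list of category labels, one per respondent
    targets : list of 0/1 outcomes (1 = event, e.g. chose the major)
    Convention assumed here: WOE = ln(share of events / share of non-events).
    """
    events, non_events = Counter(), Counter()
    for level, target in zip(levels, targets):
        if target == 1:
            events[level] += 1
        else:
            non_events[level] += 1
    total_events = sum(events.values())
    total_non_events = sum(non_events.values())

    woe, iv = {}, 0.0
    for level in sorted(set(levels)):
        p_event = events[level] / total_events
        p_non_event = non_events[level] / total_non_events
        if p_event == 0 or p_non_event == 0:
            # WOE is undefined for a level with no events or no non-events;
            # in practice such levels are merged or smoothed first.
            raise ValueError(f"level {level!r} has an empty class")
        woe[level] = math.log(p_event / p_non_event)
        iv += (p_event - p_non_event) * woe[level]
    return woe, iv


# Invented responses to a hypothetical survey item, not the study's data.
interest = ["high"] * 8 + ["low"] * 8
majored = [1, 1, 1, 1, 1, 1, 0, 0] + [1, 1, 0, 0, 0, 0, 0, 0]
woe_by_level, information_value = woe_iv(interest, majored)
```

Predictors with a higher IV separate the two outcome groups more strongly, which is why IV is often used to rank candidate variables before fitting a logistic regression.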
Natural disasters, property damage, and insurance premiums are hot topics in Alabama after it was classified the most tornado-prone state in 2011, with 177 reported tornadoes. The devastating tornadoes of April 25 to 28, 2011 caused billions of dollars in property damage and changed the way insurance companies evaluate storm risk. In the southern region of the state, hurricanes are in the spotlight. Residents may ask where the safest place to live in the state is, while insurance companies may ask which area of the state has the lowest risk of disaster-related property damage. The homeowner’s insurance market is certainly focused on managing insurance premiums while providing affordable coverage to the state’s residents. This presentation will discuss the changes in insurance rates and premiums for major insurance companies using home values from 13 major cities across three distinct regions: North, Central, and South Alabama. Base SAS® and SAS® Enterprise Miner were used to analyze Alabama’s population data from the United States Census Bureau, wages and salaries data from the Federal Reserve Economic Data, and homeowner’s insurance premium data from the Department of Insurance. The results of the analyses provide guidance in answering the questions of the safest place to live and affordable insurance premiums across the state of Alabama.
Suicidal tendency among adolescent girls is a major challenge for present-day society. The main goal of this paper is to find associations between various socio-emotional factors and suicidal tendencies among adolescent girls in the United States. The data were obtained from the National Longitudinal Study of Adolescent Health, which explores social behavior among adolescents and provides a nationally representative sample of adolescents in grades 7 through 12 in the US. Students in each school were stratified by grade and sex. About 17 students were randomly chosen from each stratum so that a total of approximately 200 adolescents were selected from each of the 80 pairs of schools. The public access sample includes 6,504 adolescents. Models built via multiple regression are used to find the association between depression, religious affiliation, and suicidal tendency. ANOVA and chi-square tests are conducted to confirm the association of religious affiliation and depression with suicidal tendency.
Whether you are an avid SAS programmer or someone who is interested in utilizing the remarkable power of SAS’s data manipulation, SAS Enterprise Guide (SAS EG) has undeniable benefits. The graphical user interface gives those who are not comfortable programming in SAS a way to use the power of SAS code and allows fast code building for experienced programmers. However, limitations exist in SAS EG: new procedures are not automated, and some procedures in SAS EG are more difficult for beginning users. Using SAS’s Integration Technologies client, particularly SAS IOM and SAS Workspace Manager, two applications have been built in Visual Basic (VB) to overcome these limitations. One application utilizes a new SAS procedure not available in SAS EG, and a second application takes a complicated process and makes it accomplishable with a few easy clicks. The first program, Auto Imputer, creates a new dataset using SAS’s PROC MI, a multiple imputation procedure. This PROC is relatively new and has yet to be automated in SAS EG. The Auto Imputer application allows the user to easily use multiple imputation to fill in missing data. The second application, Black Box Analytics, uses multiple SAS procedures and VB objects to find the best variable subset in a linear regression model. The criteria for best fit are adjusted R² and validated Mean Squared Error. These applications show that by combining the object-oriented style of VB with SAS’s powerful data manipulation, one can do amazing things that neither language could do alone.
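The idea of searching for the best variable subset by adjusted R² can be sketched in a few lines. The Python below is an illustration only and is not the Black Box Analytics application, which is built with VB and SAS procedures: it fits every subset by ordinary least squares and keeps the one with the highest adjusted R², computed as 1 − (SSE / (n − p − 1)) / (SST / (n − 1)). The validated Mean Squared Error criterion is omitted, and all names and data values are hypothetical.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def adjusted_r2(rows, y, beta):
    """Adjusted R-squared for a fitted model with p = len(beta) - 1 predictors."""
    n, p = len(y), len(beta) - 1
    mean_y = sum(y) / n
    preds = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in rows]
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - (sse / (n - p - 1)) / (sst / (n - 1))


def best_subset(data, y, names):
    """Exhaustively search predictor subsets; return (best score, chosen names)."""
    best = None
    for size in range(1, len(names) + 1):
        for subset in combinations(range(len(names)), size):
            rows = [[row[j] for j in subset] for row in data]
            score = adjusted_r2(rows, y, fit_ols(rows, y))
            if best is None or score > best[0]:
                best = (score, [names[j] for j in subset])
    return best


# Made-up data: y is roughly 2 * x1 + 1, while x2 and x3 are unrelated noise.
names = ["x1", "x2", "x3"]
data = [[1, 5, 2], [2, 3, 9], [3, 8, 4], [4, 1, 7],
        [5, 9, 1], [6, 2, 8], [7, 7, 3], [8, 4, 6]]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 15.2, 16.8]
best_score, best_names = best_subset(data, y, names)
```

Exhaustive search grows exponentially with the number of candidate variables, so it is practical only for small predictor sets; scoring each subset on held-out data would add the validation step that the abstract pairs with adjusted R².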