diff --git a/search/search_index.json b/search/search_index.json
index 0d4419a..cbe4f8f 100644
--- a/search/search_index.json
+++ b/search/search_index.json
@@ -1 +1 @@
-{"config":{"lang":["en"],"min_search_length":3,"prebuild_index":false,"separator":"[\\s\\-]+"},"docs":[{"location":"","text":"OSG School 2024 \u00b6 Could you transform your research with vast amounts of computing? Learn how this summer at the lovely University of Wisconsin\u2013Madison During the School, August 5\u20139 , you will learn to use high-throughput computing (HTC) systems \u2014 at your own campus or using the national-scale Open Science Pool \u2014 to run large-scale computing applications that are at the heart of today\u2019s cutting-edge science. Through lectures, discussions, and lots of hands-on activities with experienced OSG staff, you will learn how HTC systems work, how to run and manage lots of jobs and huge datasets to implement a scientific computing workflow, and where to get more information and help. The school is ideal for: Researchers (especially graduate students and post-docs) in any research area for which large-scale computing is a vital part of the research process; Anyone (especially students and staff) who supports researchers who are current or potential users of high-throughput computing; Instructors (at the post-secondary level) who teach future researchers and see value in integrating high-throughput computing into their curriculum. People accepted to this program will receive financial support for basic travel and local costs associated with the School. Applications \u00b6 Applications are now closed for 2024. The deadline for applications was Monday, 1 April 2024. If still needed, have someone email a letter of recommendation for you to school@osg-htc.org (ideally PDF or plain text) For the letter of recommendation, ask someone who knows you professionally \u2014 ideally a faculty member or other supervisor. They should clearly identify your name and the \u201cOSG School 2024\u201d in the subject line and letter, so that we can associate your application and letter. Applicants: We plan to review applications in April and invite participants by early May or so. We will contact you once decisions have been made. Thank you for your patience! Contact Us \u00b6 The OSG School is the premier training event of the OSG Consortium and is held annually at UW\u2013Madison. If you have any questions about the event, feel free to email us: school@osg-htc.org OSGSchool * Image provided by Wikimedia user Av9 under Creative Commons License","title":"Home"},{"location":"#osg-school-2024","text":"Could you transform your research with vast amounts of computing? Learn how this summer at the lovely University of Wisconsin\u2013Madison During the School, August 5\u20139 , you will learn to use high-throughput computing (HTC) systems \u2014 at your own campus or using the national-scale Open Science Pool \u2014 to run large-scale computing applications that are at the heart of today\u2019s cutting-edge science. Through lectures, discussions, and lots of hands-on activities with experienced OSG staff, you will learn how HTC systems work, how to run and manage lots of jobs and huge datasets to implement a scientific computing workflow, and where to get more information and help. 
The school is ideal for: Researchers (especially graduate students and post-docs) in any research area for which large-scale computing is a vital part of the research process; Anyone (especially students and staff) who supports researchers who are current or potential users of high-throughput computing; Instructors (at the post-secondary level) who teach future researchers and see value in integrating high-throughput computing into their curriculum. People accepted to this program will receive financial support for basic travel and local costs associated with the School.","title":"OSG School 2024"},{"location":"#applications","text":"Applications are now closed for 2024. The deadline for applications was Monday, 1 April 2024. If still needed, have someone email a letter of recommendation for you to school@osg-htc.org (ideally PDF or plain text) For the letter of recommendation, ask someone who knows you professionally \u2014 ideally a faculty member or other supervisor. They should clearly identify your name and the \u201cOSG School 2024\u201d in the subject line and letter, so that we can associate your application and letter. Applicants: We plan to review applications in April and invite participants by early May or so. We will contact you once decisions have been made. Thank you for your patience!","title":"Applications"},{"location":"#contact-us","text":"The OSG School is the premier training event of the OSG Consortium and is held annually at UW\u2013Madison. If you have any questions about the event, feel free to email us: school@osg-htc.org OSGSchool * Image provided by Wikimedia user Av9 under Creative Commons License","title":"Contact Us"},{"location":"health/","text":"Health Guidelines \u00b6 The OSG School 2024 at the UW\u2013Madison welcomes participants from around the United States plus India, Mali, and Uganda. This page contains health guidelines for this year\u2019s School. While the focus is on COVID-19, most of these guidelines also apply to preventing the spread of other infectious illnesses (flu, colds, GI viruses, etc.). It is very important to us that everyone stays safe and healthy throughout the whole School. We will have the best event possible if everyone stays well! There are no hard rules here, just a reminder that we are all in this together . If you have any questions, concerns, or comments about these guidelines, please email us at school@osg-htc.org or message us on Slack. Before Traveling to the School \u00b6 If you tested positive for COVID recently (past 2 weeks or so), please follow CDC guidelines for what to do when sick. Even if you have no symptoms or known exposure, consider taking a rapid test before traveling to improve the odds that you are not bringing COVID to the event. If you DO test positive before the School, or if you do not feel well enough to travel for any reason, please let us know immediately so we can accommodate you (see below for remote participation options). While in Madison \u00b6 Wearing a mask is welcome at the School itself when indoors or in other poorly ventilated areas. We can provide a few high-quality KN95 masks for people who would like them and have not brought their own. We encourage everyone to consider outdoor dining options when reasonable \u2014 not just for reducing risk, but also because Madison is beautiful in the summer! While in Madison, if you feel unwell, stay home or at the hotel. 
When you can, let School staff know why you are absent \u2014 by email or Slack \u2014 and if you would like to keep up with exercises and lectures, we will help support you remotely (see below). If you experience possible symptoms of COVID-19 , or test positive for COVID-19, follow CDC guidelines for what to do when sick. Remote Attendance \u00b6 If you are in Madison and are sick or quarantined, or if you are not able to travel to Madison, we will do our best to support you via remote attendance. We learned a lot about remote events during the pandemic! We can: Try to stream lectures live over Zoom Post all slides and exercises on the website Be active on Slack and email Conduct one-on-one consultations over Zoom As long as you feel up to it, we will do our best to support you during the School.","title":"Health Guidelines"},{"location":"health/#health-guidelines","text":"The OSG School 2024 at the UW\u2013Madison welcomes participants from around the United States plus India, Mali, and Uganda. This page contains health guidelines for this year\u2019s School. While the focus is on COVID-19, most of these guidelines also apply to preventing the spread of other infectious illnesses (flu, colds, GI viruses, etc.). It is very important to us that everyone stays safe and healthy throughout the whole School. We will have the best event possible if everyone stays well! There are no hard rules here, just a reminder that we are all in this together . If you have any questions, concerns, or comments about these guidelines, please email us at school@osg-htc.org or message us on Slack.","title":"Health Guidelines"},{"location":"health/#before-traveling-to-the-school","text":"If you tested positive for COVID recently (past 2 weeks or so), please follow CDC guidelines for what to do when sick. Even if you have no symptoms or known exposure, consider taking a rapid test before traveling to improve the odds that you are not bringing COVID to the event. If you DO test positive before the School, or if you do not feel well enough to travel for any reason, please let us know immediately so we can accommodate you (see below for remote participation options).","title":"Before Traveling to the School"},{"location":"health/#while-in-madison","text":"Wearing a mask is welcome at the School itself when indoors or in other poorly ventilated areas. We can provide a few high-quality KN95 masks for people who would like them and have not brought their own. We encourage everyone to consider outdoor dining options when reasonable \u2014 not just for reducing risk, but also because Madison is beautiful in the summer! While in Madison, if you feel unwell, stay home or at the hotel. When you can, let School staff know why you are absent \u2014 by email or Slack \u2014 and if you would like to keep up with exercises and lectures, we will help support you remotely (see below). If you experience possible symptoms of COVID-19 , or test positive for COVID-19, follow CDC guidelines for what to do when sick.","title":"While in Madison"},{"location":"health/#remote-attendance","text":"If you are in Madison and are sick or quarantined, or if you are not able to travel to Madison, we will do our best to support you via remote attendance. We learned a lot about remote events during the pandemic! 
We can: Try to stream lectures live over Zoom Post all slides and exercises on the website Be active on Slack and email Conduct one-on-one consultations over Zoom As long as you feel up to it, we will do our best to support you during the School.","title":"Remote Attendance"},{"location":"schedule/","text":"August 4 (Sunday) \u00b6 Welcome Dinner for Participants and Staff All School participants and staff are encouraged to attend! Time: Starting at 6:30 p.m. Location : Fluno Center , 601 University Avenue; Skyview Room, 8th floor August 5 (Monday) \u00b6 Start End Event Instructor 8:00 8:45 Breakfast in Computer Sciences 1240 - 9:00 9:15 Welcome to the OSG School Tim C. 9:15 9:30 Lecture: Introduction to High Throughput Computing Christina 9:30 9:45 Exercise: Scaling Out Computing Worksheet - 9:45 10:15 Lecture: Introduction to HTCondor Andrew 10:15 10:30 Exercise: Log in - 10:30 10:45 Break - 10:45 12:15 Exercises: HTCondor basics (1.n series) - 12:15 13:15 Lunch in Computer Sciences (near 1240) - 13:15 14:00 Lecture: More HTCondor Andrew 14:15 15:00 Exercises: Many jobs (2.n series) - 15:00 15:15 Break - 15:15 15:30 Lecture: Setting goals for the School and beyond Rachel 15:30 17:00 Exercises: Goals and unfinished exercises Individual consultations - 19:00 20:30 Evening work session (optional) Memorial Union \u2013 Council Room (4th Floor) Note: Free, outdoor showing of Jaws (1975) at 9 p.m.! Rachel, Christina, Tim August 6 (Tuesday) \u00b6 Start End Event Instructor 8:00 8:45 Breakfast in Computer Sciences 1240 - 9:00 9:45 Lecture: Introduction to dHTC and the OSPool Tim C. 9:45 10:30 Exercises: Using the OSPool - 10:30 10:45 Break Travel document collection, as needed - 10:45 11:30 Lecture: Troubleshooting jobs Showmic 11:30 12:15 Exercises: Basic troubleshooting tools - 12:15 13:30 Lunch in Computer Sciences (near 1240) 13:15: Return documents in 1240 - 13:30 14:45 Interactive: High Throughput Computing in action staff 14:45 15:00 Break - 15:00 15:45 Lecture: Software portability Rachel 15:45 17:00 Exercises: Software and unfinished exercises Individual consultations staff 19:00 20:30 Evening work session (optional) Memorial Union \u2013 Council Room (4th Floor) Christina, Amber, Tim August 7 (Wednesday) \u00b6 Start End Event Instructor 8:00 8:45 Breakfast in Computer Sciences 1240 - 9:00 9:45 Lecture: Working with data Andrew 9:45 10:45 Exercises: Data - 10:45 11:00 Break - 11:00 12:00 HTC Showcase Part 1 \u25b6 Michael Gerard ; Nuclear Engineering & Engineering Physics \u201cUsing CHTC to optimize the Helically Symmetric eXperiment stellarator\u201d \u25b6 Bryce Johnson ; Morgridge Institute for Research & UW\u2013Madison Computer Sciences \u201cRunning millions of biophysical simulations with OSPool\u201d - 12:00 12:30 Open Q&A and discussion time staff 12:30 13:45 Lunch in Computer Sciences (near 1240) Optional Domain Lunches: Christina (math); Rachel (biology); Andrew/Amber (chemistry); Ian (ML); Tim (physics & astronomy) - 13:45 17:00 Afternoon off \u2014 suggestions for fun Or, optional work time Or, individual consultations staff 19:00 20:30 Evening work session (optional) Memorial Union \u2013 Langdon Room (4th Floor) Christina, Showmic August 8 (Thursday) \u00b6 Start End Event Instructor 8:00 8:45 Breakfast in Computer Sciences 1240 - 9:00 9:45 Lecture: Independence in Research Computing Christina 9:45 10:45 Exercises: Scaling up - 10:45 11:00 Break - 11:00 12:00 Lecture: DAGMan Rachel 12:00 13:15 Lunch in Computer Sciences (near 1240) - 13:15 14:30 Exercises: DAGMan Work 
Time: Apply HTC to own research Individual consultations staff 14:30 14:45 Break - 14:45 15:45 Work Time: Apply HTC to own research Individual consultations staff 15:45 16:30 Lecture: Machine Learning Ian 19:00 20:30 Evening work session (optional) Memorial Union \u2013 Council Room (4th Floor) Andrew, Showmic, Tim August 9 (Friday) \u00b6 Start End Event Instructor 8:00 8:45 Breakfast in Computer Sciences 1240 - 9:00 9:45 Optional Lecture: Self-Checkpointing Work Time: Apply HTC to own research Showmic 9:45 10:30 Work time: Apply HTC to own research Individual consultations staff 10:30 10:45 Break - 10:45 11:30 Work time: Apply HTC to own research - 11:30 11:50 Group photo (details TBD) - 11:50 13:00 Lunch, Computer Sciences (Staff to direct) Optional: Introduction to Research Computing Facilitation Christina 13:00 14:00 HTC Showcase, Part 2 \u25b6 Dan Wright ; Civil & Environmental Engineering \u201cComputational hydroclimate research enabled by HTC\u201d \u25b6 Saloni Bhogale ; \u201cTBD\u201d - 14:00 14:30 Open Q&A Work time: Apply HTC to own research Break - 14:30 15:30 Lightning talks by volunteer participants Attendees 15:30 16:00 Open Q&A and work time staff 16:00 16:45 HTC and HTCondor Philosophy Miron (and Greg?) 16:45 17:15 Lecture: Forward Tim Closing Dinner for Participants and Staff Location: Union South (next to Computer Sciences), indoors/outdoors Time: 6:00\u20137:45 p.m. (or so) We have scheduled a buffet dinner to wrap up the School. The buffet itself, and some tables, are inside. But if the weather is good, we will have easy access to outdoor space, too, and we encourage everyone to head outside with their food.","title":"Schedule"},{"location":"schedule/#august-4-sunday","text":"Welcome Dinner for Participants and Staff All School participants and staff are encouraged to attend! Time: Starting at 6:30 p.m. Location : Fluno Center , 601 University Avenue; Skyview Room, 8th floor","title":"August 4 (Sunday)"},{"location":"schedule/#august-5-monday","text":"Start End Event Instructor 8:00 8:45 Breakfast in Computer Sciences 1240 - 9:00 9:15 Welcome to the OSG School Tim C. 9:15 9:30 Lecture: Introduction to High Throughput Computing Christina 9:30 9:45 Exercise: Scaling Out Computing Worksheet - 9:45 10:15 Lecture: Introduction to HTCondor Andrew 10:15 10:30 Exercise: Log in - 10:30 10:45 Break - 10:45 12:15 Exercises: HTCondor basics (1.n series) - 12:15 13:15 Lunch in Computer Sciences (near 1240) - 13:15 14:00 Lecture: More HTCondor Andrew 14:15 15:00 Exercises: Many jobs (2.n series) - 15:00 15:15 Break - 15:15 15:30 Lecture: Setting goals for the School and beyond Rachel 15:30 17:00 Exercises: Goals and unfinished exercises Individual consultations - 19:00 20:30 Evening work session (optional) Memorial Union \u2013 Council Room (4th Floor) Note: Free, outdoor showing of Jaws (1975) at 9 p.m.! Rachel, Christina, Tim","title":"August 5 (Monday)"},{"location":"schedule/#august-6-tuesday","text":"Start End Event Instructor 8:00 8:45 Breakfast in Computer Sciences 1240 - 9:00 9:45 Lecture: Introduction to dHTC and the OSPool Tim C. 
9:45 10:30 Exercises: Using the OSPool - 10:30 10:45 Break Travel document collection, as needed - 10:45 11:30 Lecture: Troubleshooting jobs Showmic 11:30 12:15 Exercises: Basic troubleshooting tools - 12:15 13:30 Lunch in Computer Sciences (near 1240) 13:15: Return documents in 1240 - 13:30 14:45 Interactive: High Throughput Computing in action staff 14:45 15:00 Break - 15:00 15:45 Lecture: Software portability Rachel 15:45 17:00 Exercises: Software and unfinished exercises Individual consultations staff 19:00 20:30 Evening work session (optional) Memorial Union \u2013 Council Room (4th Floor) Christina, Amber, Tim","title":"August 6 (Tuesday)"},{"location":"schedule/#august-7-wednesday","text":"Start End Event Instructor 8:00 8:45 Breakfast in Computer Sciences 1240 - 9:00 9:45 Lecture: Working with data Andrew 9:45 10:45 Exercises: Data - 10:45 11:00 Break - 11:00 12:00 HTC Showcase Part 1 \u25b6 Michael Gerard ; Nuclear Engineering & Engineering Physics \u201cUsing CHTC to optimize the Helically Symmetric eXperiment stellarator\u201d \u25b6 Bryce Johnson ; Morgridge Institute for Research & UW\u2013Madison Computer Sciences \u201cRunning millions of biophysical simulations with OSPool\u201d - 12:00 12:30 Open Q&A and discussion time staff 12:30 13:45 Lunch in Computer Sciences (near 1240) Optional Domain Lunches: Christina (math); Rachel (biology); Andrew/Amber (chemistry); Ian (ML); Tim (physics & astronomy) - 13:45 17:00 Afternoon off \u2014 suggestions for fun Or, optional work time Or, individual consultations staff 19:00 20:30 Evening work session (optional) Memorial Union \u2013 Langdon Room (4th Floor) Christina, Showmic","title":"August 7 (Wednesday)"},{"location":"schedule/#august-8-thursday","text":"Start End Event Instructor 8:00 8:45 Breakfast in Computer Sciences 1240 - 9:00 9:45 Lecture: Independence in Research Computing Christina 9:45 10:45 Exercises: Scaling up - 10:45 11:00 Break - 11:00 12:00 Lecture: DAGMan Rachel 12:00 13:15 Lunch in Computer Sciences (near 1240) - 13:15 14:30 Exercises: DAGMan Work Time: Apply HTC to own research Individual consultations staff 14:30 14:45 Break - 14:45 15:45 Work Time: Apply HTC to own research Individual consultations staff 15:45 16:30 Lecture: Machine Learning Ian 19:00 20:30 Evening work session (optional) Memorial Union \u2013 Council Room (4th Floor) Andrew, Showmic, Tim","title":"August 8 (Thursday)"},{"location":"schedule/#august-9-friday","text":"Start End Event Instructor 8:00 8:45 Breakfast in Computer Sciences 1240 - 9:00 9:45 Optional Lecture: Self-Checkpointing Work Time: Apply HTC to own research Showmic 9:45 10:30 Work time: Apply HTC to own research Individual consultations staff 10:30 10:45 Break - 10:45 11:30 Work time: Apply HTC to own research - 11:30 11:50 Group photo (details TBD) - 11:50 13:00 Lunch, Computer Sciences (Staff to direct) Optional: Introduction to Research Computing Facilitation Christina 13:00 14:00 HTC Showcase, Part 2 \u25b6 Dan Wright ; Civil & Environmental Engineering \u201cComputational hydroclimate research enabled by HTC\u201d \u25b6 Saloni Bhogale ; \u201cTBD\u201d - 14:00 14:30 Open Q&A Work time: Apply HTC to own research Break - 14:30 15:30 Lightning talks by volunteer participants Attendees 15:30 16:00 Open Q&A and work time staff 16:00 16:45 HTC and HTCondor Philosophy Miron (and Greg?) 16:45 17:15 Lecture: Forward Tim Closing Dinner for Participants and Staff Location: Union South (next to Computer Sciences), indoors/outdoors Time: 6:00\u20137:45 p.m. 
(or so) We have scheduled a buffet dinner to wrap up the School. The buffet itself, and some tables, are inside. But if the weather is good, we will have easy access to outdoor space, too, and we encourage everyone to head outside with their food.","title":"August 9 (Friday)"},{"location":"logistics/","text":"OSG School 2024 Logistics \u00b6 The following pages describe some of the important information about your visit to Madison for the OSG School. Please read them carefully. There will be other pages with local details soon. Visa requirements for non-residents Travel planning to and from Madison Hotel information As always: If you have questions, email us at school@osg-htc.org . Use that email address for all emails about the organization of the School. General Information About the School Schedule \u00b6 Travel Schedule \u00b6 Most participants should plan to travel as follows: Arrive on Sunday, August 4, 2024, by about 5:00 p.m. (if possible). There is a welcome dinner on Sunday evening for all participants (including instructors), and then classes begin on Monday morning. This is a nice way to get to know each other and start the week. Depart on Saturday, August 10, 2024, any time. The School ends with a closing dinner on Friday evening, so it is best to stay that night. If we offered to pay for your hotel room, we will pay for the 6 nights of this schedule. Note: If we suggested other travel dates to you in an email, then use those dates instead! School Hours \u00b6 The School is generally Monday through Friday, 9:00 a.m. to about 5:00 p.m., except Wednesday afternoon. There will be optional work sessions on Monday, Tuesday, Wednesday, and Thursday evenings. A detailed schedule will be posted before the event. Contact Information \u00b6 If you have questions, do not wait to contact us! school@osg-htc.org","title":"General information"},{"location":"logistics/#osg-school-2024-logistics","text":"The following pages describe some of the important information about your visit to Madison for the OSG School. Please read them carefully. There will be other pages with local details soon. Visa requirements for non-residents Travel planning to and from Madison Hotel information As always: If you have questions, email us at school@osg-htc.org . Use that email address for all emails about the organization of the School.","title":"OSG School 2024 Logistics"},{"location":"logistics/#general-information-about-the-school-schedule","text":"","title":"General Information About the School Schedule"},{"location":"logistics/#travel-schedule","text":"Most participants should plan to travel as follows: Arrive on Sunday, August 4, 2024, by about 5:00 p.m. (if possible). There is a welcome dinner on Sunday evening for all participants (including instructors), and then classes begin on Monday morning. This is a nice way to get to know each other and start the week. Depart on Saturday, August 10, 2024, any time. The School ends with a closing dinner on Friday evening, so it is best to stay that night. If we offered to pay for your hotel room, we will pay for the 6 nights of this schedule. Note: If we suggested other travel dates to you in an email, then use those dates instead!","title":"Travel Schedule"},{"location":"logistics/#school-hours","text":"The School is generally Monday through Friday, 9:00 a.m. to about 5:00 p.m., except Wednesday afternoon. There will be optional work sessions on Monday, Tuesday, Wednesday, and Thursday evenings. 
A detailed schedule will be posted before the event.","title":"School Hours"},{"location":"logistics/#contact-information","text":"If you have questions, do not wait to contact us! school@osg-htc.org","title":"Contact Information"},{"location":"logistics/account-setup/","text":".hi { font-weight: bold; color: #FF6600; } Apply for Computing Access \u00b6 We will be using two different Access Points during the OSG School - ap40.uw.osg-htc.org and ap1.facility.path-cc.io . As soon as possible, please request your account access using this link: OSG School Account Registration Instructions on setting up your account can be found using this guide: Log in to uw.osg-htc.org Access Points We strongly recommend going through the registration process and trying to log in before the School, ideally before your OSG orientation session. If you run into problems, contact us at support@osg-htc.org .","title":"Account setup"},{"location":"logistics/account-setup/#apply-for-computing-access","text":"We will be using two different Access Points during the OSG School - ap40.uw.osg-htc.org and ap1.facility.path-cc.io . As soon as possible, please request your account access using this link: OSG School Account Registration Instructions on setting up your account can be found using this guide: Log in to uw.osg-htc.org Access Points We strongly recommend going through the registration process and trying to log in before the School, ideally before your OSG orientation session. If you run into problems, contact us at support@osg-htc.org .","title":"Apply for Computing Access"},{"location":"logistics/dining/","text":"Dining \u00b6 The School provides some catered meals as a group, and you are on your own for others. When on your own, there are many dining options in Madison between the School and your hotel, especially on State Street which is only blocks away from both locations. Restaurants right on and very near to the Capitol Square, onto which the hotel faces, tend to be a little more expensive. As you go toward campus on State Street or neighboring streets, prices tend to go down. But of course, there are exceptions in both directions! It is reasonable to ask to see a menu before ordering or being seated and decide whether to stay. Food Options Near the Hotel \u00b6 Use a mapping app or rating services like Yelp to look for food options. For example: Food Options Near the School \u00b6 There are not a lot of great food options very close to the School, but feel free to ask School staff for suggestions.","title":"Dining options"},{"location":"logistics/dining/#dining","text":"The School provides some catered meals as a group, and you are on your own for others. When on your own, there are many dining options in Madison between the School and your hotel, especially on State Street which is only blocks away from both locations. Restaurants right on and very near to the Capitol Square, onto which the hotel faces, tend to be a little more expensive. As you go toward campus on State Street or neighboring streets, prices tend to go down. But of course, there are exceptions in both directions! It is reasonable to ask to see a menu before ordering or being seated and decide whether to stay.","title":"Dining"},{"location":"logistics/dining/#food-options-near-the-hotel","text":"Use a mapping app or rating services like Yelp to look for food options. 
For example:","title":"Food Options Near the Hotel"},{"location":"logistics/dining/#food-options-near-the-school","text":"There are not a lot of great food options very close to the School, but feel free to ask School staff for suggestions.","title":"Food Options Near the School"},{"location":"logistics/fun-day/","text":"Fun Activity Ideas While in Madison \u00b6 Free \u00b6 Narrated tour via UW-Madison app : Discover UW\u2013Madison using our free mobile app featuring a student-led narrated tour that is self-guided. The tour includes information about buildings, academics, transportation, housing, and all things surrounding the student experience. Start at Union South, 1308 W Dayton St (~1 minute walk from School) UW\u2013Madison Geology Museum : Large collection of geological specimens. Across Dayton Street from the School building. 1215 Dayton Street (~2 minute walk from School) L.R. Ingersoll Physics Museum : Small museum of Physics objects and demonstrations. Very short walk from the School building: Chamberlin Hall, 1150 University Avenue. (~6 minute walk from School) Terrace Open Mic Night : Enjoy a night out where all styles of music, comedy, spoken word, poetry, and more take the stage. Performances start at 7 PM on Wednesday. 800 Langdon Street (~15 minute walk from School) Tour of Wisconsin State Capitol : Tours start at 1, 2, 3, and 4 p.m. and last about 45 minutes. 2 E Main Street (~29 minute walk from School and across the Park Hotel) Henry Vilas Zoo : One-mile walk south of Computer Sciences: 702 South Randall Avenue. (~18 minute walk from School) Take a stroll or a ride on The Lakeshore Path : Reach the infamous Picnic Point or take your trip to the Arboretum! Cost \u00b6 Rent a Bcycle : Take advantage of Madison's many bike paths throughout Madison. Camp Randall Guided Tour : 1440 Monroe St; Tour starts promptly at 2:30 PM on Wednesday and will approximately last one hour; $10 per person (~8 minute walk from School) Paddling rentals on Lake Mendota : Paddling rentals, including paddleboards, kayak, and canoes. Memorial Union Terrace, $18 per hour. 800 Langdon Street (~15 minute walk from School) Tour of First Unitarian Society\u2019s Meeting House : The Landmark Auditorium was designed by Frank Lloyd Wright. $15 per person ($12.50 if booked online in advance), up to 10 people. 900 University Bay Drive (~38 minute walk; Bus accessible, with close stop) Olbrich Botanical Gardens : 16 acres outdoor (FREE); indoor: $6 conservatory; $8 butterfly house. 3330 Atwood Avenue (~15 minute drive from School; Bus accessible, with close stop) Disclaimer \u00b6 The Chazen Museum of Art has a summer closure from August 5th-9th. Madison Museum of Contemporary Art (MMoCA) is closed on Wednesdays.","title":"Madison Fun Day"},{"location":"logistics/fun-day/#fun-activity-ideas-while-in-madison","text":"","title":"Fun Activity Ideas While in Madison"},{"location":"logistics/fun-day/#free","text":"Narrated tour via UW-Madison app : Discover UW\u2013Madison using our free mobile app featuring a student-led narrated tour that is self-guided. The tour includes information about buildings, academics, transportation, housing, and all things surrounding the student experience. Start at Union South, 1308 W Dayton St (~1 minute walk from School) UW\u2013Madison Geology Museum : Large collection of geological specimens. Across Dayton Street from the School building. 1215 Dayton Street (~2 minute walk from School) L.R. Ingersoll Physics Museum : Small museum of Physics objects and demonstrations. 
Very short walk from the School building: Chamberlin Hall, 1150 University Avenue. (~6 minute walk from School) Terrace Open Mic Night : Enjoy a night out where all styles of music, comedy, spoken word, poetry, and more take the stage. Performances start at 7 PM on Wednesday. 800 Langdon Street (~15 minute walk from School) Tour of Wisconsin State Capitol : Tours start at 1, 2, 3, and 4 p.m. and last about 45 minutes. 2 E Main Street (~29 minute walk from School and across the Park Hotel) Henry Vilas Zoo : One-mile walk south of Computer Sciences: 702 South Randall Avenue. (~18 minute walk from School) Take a stroll or a ride on The Lakeshore Path : Reach the infamous Picnic Point or take your trip to the Arboretum!","title":"Free"},{"location":"logistics/fun-day/#cost","text":"Rent a Bcycle : Take advantage of Madison's many bike paths throughout Madison. Camp Randall Guided Tour : 1440 Monroe St; Tour starts promptly at 2:30 PM on Wednesday and will approximately last one hour; $10 per person (~8 minute walk from School) Paddling rentals on Lake Mendota : Paddling rentals, including paddleboards, kayak, and canoes. Memorial Union Terrace, $18 per hour. 800 Langdon Street (~15 minute walk from School) Tour of First Unitarian Society\u2019s Meeting House : The Landmark Auditorium was designed by Frank Lloyd Wright. $15 per person ($12.50 if booked online in advance), up to 10 people. 900 University Bay Drive (~38 minute walk; Bus accessible, with close stop) Olbrich Botanical Gardens : 16 acres outdoor (FREE); indoor: $6 conservatory; $8 butterfly house. 3330 Atwood Avenue (~15 minute drive from School; Bus accessible, with close stop)","title":"Cost"},{"location":"logistics/fun-day/#disclaimer","text":"The Chazen Museum of Art has a summer closure from August 5th-9th. Madison Museum of Contemporary Art (MMoCA) is closed on Wednesdays.","title":"Disclaimer"},{"location":"logistics/hotel/","text":".hi { font-weight: bold; color: #FF6600; } Hotel Information \u00b6 We reserved a block of rooms at an area hotel for participants from outside Madison. Best Western Premier Park Hotel 22 South Carroll Street, Madison, WI +1 (608) 285\u20118000 Please note: We will reserve your room for you, so do not contact the hotel yourself to reserve a room. Exceptions to this rule are rare and clearly communicated. Other important hotel information: Before the School, we will send you an email with your hotel confirmation number We pay only for basic room costs \u2014 you must provide a credit card to cover extra costs There is one School participant per room; to have friends or family stay with you, please ask us now Check-In Time \u00b6 The (earliest) check-in time at the hotel is 4 p.m. on your day of arrival. If you are arriving earlier, you have options: Ask the hotel if it is possible to check in earlier than 4 p.m. It is up to the hotel to decide if they can meet your request. If there is any additional expense required, you must pay that yourself. Ask the hotel to put your bags in a safe spot and enjoy Madison until 4 p.m. or later. Keep your bags with you and enjoy Madison until 4 p.m. or later. Check-Out Time \u00b6 The (latest) check-out time from the hotel is 11 a.m. on your day of departure. If you are leaving later, you have options: Ask the hotel to put your bags in a safe spot and enjoy Madison until it is time to leave. Keep your bags with you and enjoy Madison until it is time to leave. 
You are not required to travel directly from the hotel to the airport, but if you do, we may be able to help you arrange to use the free hotel shuttle.","title":"Hotel information"},{"location":"logistics/hotel/#hotel-information","text":"We reserved a block of rooms at an area hotel for participants from outside Madison. Best Western Premier Park Hotel 22 South Carroll Street, Madison, WI +1 (608) 285\u20118000 Please note: We will reserve your room for you, so do not contact the hotel yourself to reserve a room. Exceptions to this rule are rare and clearly communicated. Other important hotel information: Before the School, we will send you an email with your hotel confirmation number We pay only for basic room costs \u2014 you must provide a credit card to cover extra costs There is one School participant per room; to have friends or family stay with you, please ask us now","title":"Hotel Information"},{"location":"logistics/hotel/#check-in-time","text":"The (earliest) check-in time at the hotel is 4 p.m. on your day of arrival. If you are arriving earlier, you have options: Ask the hotel if it is possible to check in earlier than 4 p.m. It is up to the hotel to decide if they can meet your request. If there is any additional expense required, you must pay that yourself. Ask the hotel to put your bags in a safe spot and enjoy Madison until 4 p.m. or later. Keep your bags with you and enjoy Madison until 4 p.m. or later.","title":"Check-In Time"},{"location":"logistics/hotel/#check-out-time","text":"The (latest) check-out time from the hotel is 11 a.m. on your day of departure. If you are leaving later, you have options: Ask the hotel to put your bags in a safe spot and enjoy Madison until it is time to leave. Keep your bags with you and enjoy Madison until it is time to leave. You are not required to travel directly from the hotel to the airport, but if you do, we may be able to help you arrange to use the free hotel shuttle.","title":"Check-Out Time"},{"location":"logistics/local-transportation/","text":"Local Transportation \u00b6 You are responsible for your own transportation within Madison, but we will help coordinate and can reimburse costs between the airport and your hotel. Travel Between the Madison Airport and Your Hotel \u00b6 For travel between the Madison airport (Dane County Regional Airport) and the School hotel , the best option is the hotel shuttle service, when available. Otherwise, you may use a ride-sharing service or taxi. See below for details. We will help organize groups to take shuttles and taxis, based on arrival and departure times. Shuttle/taxi groups will be formed and emailed shortly before the School itself. Travel Between the Hotel and Campus \u00b6 For travel between the School hotel and the Computer Sciences building on campus, walking is a great option. Also, the hotel shuttle service may be available, especially if organized in advance. See below for details. Options for Getting Around \u00b6 Hotel Shuttle \u00b6 The Park Hotel operates a free shuttle service. The shuttle may not be available at all times, though, and it is best to plan ahead. Work with the hotel staff, individually or even better in groups, to use the shuttle. As noted above, we will help organize groups for the shuttle for airport arrivals on Sunday and departures on Saturday. To ask about the shuttle, either stop by the front desk of the hotel, or call +1 (608) 285-8000 and press 0 for the front desk. 
Explain that you are a guest at the hotel and ask if the shuttle is available for the number of people in your group; be clear about where you want to go from and to and at what time. We will send the hotel our list of groups who would like the shuttle for airport trips, but it is still best for the leader of each group to check with the hotel anyway. Walking \u00b6 It is easy to walk in and around the University of Wisconsin\u2013Madison campus, with many Madison landmarks within a mile of the School and your hotel. Use a mapping app or ask us or your hotel for a map. In particular, State Street \u2014 which connects the Capitol Square with the UW campus \u2014 is full of great restaurants and shops and is worth walking along while you are here. City of Madison Metro Bus Service \u00b6 Many Madison Metro buses stop near the hotel and pass through the University of Wisconsin\u2013Madison campus. Bus fare is $2.00, and if using a transfer ask the driver for a free transfer pass upon boarding. Google Maps is a great resource for finding the best bus routes to use in Madison, giving multiple route options for each trip. Additionally, the Madison Metro Website provides a web interface to plan your trip. Note Bus routes stop running around 11 p.m. each day. Taxis and Ride-Sharing Services \u00b6 Both Lyft and Uber are active in Madison, or you can choose from our local taxi companies, such as Madison Taxi and Union Cab . We cannot recommend any particular option, but those are some options we know about. Note We cannot reimburse for any taxi or rideshare service beyond the ride to and from the airport. Note We will need receipts for any ride-share or taxi fare over $25. Madison BCycle \u00b6 Madison is a great city to bike in, and there is even a short-term bike rental system called BCycle . Bcycles are available throughout the city , including near the hotel and around campus. Pricing for Bcycles can be found on their website and consists of several tiers. Note Unfortunately, we are not able to reimburse BCycle costs.","title":"Local transportation"},{"location":"logistics/local-transportation/#local-transportation","text":"You are responsible for your own transportation within Madison, but we will help coordinate and can reimburse costs between the airport and your hotel.","title":"Local Transportation"},{"location":"logistics/local-transportation/#travel-between-the-madison-airport-and-your-hotel","text":"For travel between the Madison airport (Dane County Regional Airport) and the School hotel , the best option is the hotel shuttle service, when available. Otherwise, you may use a ride-sharing service or taxi. See below for details. We will help organize groups to take shuttles and taxis, based on arrival and departure times. Shuttle/taxi groups will be formed and emailed shortly before the School itself.","title":"Travel Between the Madison Airport and Your Hotel"},{"location":"logistics/local-transportation/#travel-between-the-hotel-and-campus","text":"For travel between the School hotel and the Computer Sciences building on campus, walking is a great option. Also, the hotel shuttle service may be available, especially if organized in advance. See below for details.","title":"Travel Between the Hotel and Campus"},{"location":"logistics/local-transportation/#options-for-getting-around","text":"","title":"Options for Getting Around"},{"location":"logistics/local-transportation/#hotel-shuttle","text":"The Park Hotel operates a free shuttle service. 
The shuttle may not be available at all times, though, and it is best to plan ahead. Work with the hotel staff, individually or even better in groups, to use the shuttle. As noted above, we will help organize groups for the shuttle for airport arrivals on Sunday and departures on Saturday. To ask about the shuttle, either stop by the front desk of the hotel, or call +1 (608) 285-8000 and press 0 for the front desk. Explain that you are a guest at the hotel and ask if the shuttle is available for the number of people in your group; be clear about where you want to go from and to and at what time. We will send the hotel our list of groups who would like the shuttle for airport trips, but it is still best for the leader of each group to check with the hotel anyway.","title":"Hotel Shuttle"},{"location":"logistics/local-transportation/#walking","text":"It is easy to walk in and around the University of Wisconsin\u2013Madison campus, with many Madison landmarks within a mile of the School and your hotel. Use a mapping app or ask us or your hotel for a map. In particular, State Street \u2014 which connects the Capitol Square with the UW campus \u2014 is full of great restaurants and shops and is worth walking along while you are here.","title":"Walking"},{"location":"logistics/local-transportation/#city-of-madison-metro-bus-service","text":"Many Madison Metro buses stop near the hotel and pass through the University of Wisconsin\u2013Madison campus. Bus fare is $2.00, and if using a transfer ask the driver for a free transfer pass upon boarding. Google Maps is a great resource for finding the best bus routes to use in Madison, giving multiple route options for each trip. Additionally, the Madison Metro Website provides a web interface to plan your trip. Note Bus routes stop running around 11 p.m. each day.","title":"City of Madison Metro Bus Service"},{"location":"logistics/local-transportation/#taxis-and-ride-sharing-services","text":"Both Lyft and Uber are active in Madison, or you can choose from our local taxi companies, such as Madison Taxi and Union Cab . We cannot recommend any particular option, but those are some options we know about. Note We cannot reimburse for any taxi or rideshare service beyond the ride to and from the airport. Note We will need receipts for any ride-share or taxi fare over $25.","title":"Taxis and Ride-Sharing Services"},{"location":"logistics/local-transportation/#madison-bcycle","text":"Madison is a great city to bike in, and there is even a short-term bike rental system called BCycle . Bcycles are available throughout the city , including near the hotel and around campus. Pricing for Bcycles can be found on their website and consists of several tiers. Note Unfortunately, we are not able to reimburse BCycle costs.","title":"Madison BCycle"},{"location":"logistics/location/","text":"School Location \u00b6 The school will be held at the University of Wisconsin\u2013Madison in the Computer Sciences Building , located at 1210 West Dayton Street, Madison, WI, 53706 . This location is about 1.3 miles from your hotel. The main classroom is Room 1240 (see below). See the local transportation page for suggestions about getting around Madison. Computer Sciences Building, Room 1240 \u00b6 Most School sessions are held in Room 1240 . 
If you enter the building from Dayton Street: Enter straight into the building from the street Immediately turn left and go through two sets of doors Pass the elevator (on your right) and walk down the hallway 1240 is on your right up the few steps Generally, just follow signs for 1240. Restrooms \u00b6 There are restrooms across the hallway and a bit to the right of 1240. For those, or other options, just ask staff!","title":"School location"},{"location":"logistics/location/#school-location","text":"The school will be held at the University of Wisconsin\u2013Madison in the Computer Sciences Building , located at 1210 West Dayton Street, Madison, WI, 53706 . This location is about 1.3 miles from your hotel. The main classroom is Room 1240 (see below). See the local transportation page for suggestions about getting around Madison.","title":"School Location"},{"location":"logistics/location/#computer-sciences-building-room-1240","text":"Most School sessions are held in Room 1240 . If you enter the building from Dayton Street: Enter straight into the building from the street Immediately turn left and go through two sets of doors Pass the elevator (on your right) and walk down the hallway 1240 is on your right up the few steps Generally, just follow signs for 1240.","title":"Computer Sciences Building, Room 1240"},{"location":"logistics/location/#restrooms","text":"There are restrooms across the hallway and a bit to the right of 1240. For those, or other options, just ask staff!","title":"Restrooms"},{"location":"logistics/meals/","text":"Meal Information \u00b6 The School includes some group catered meals for all participants: Sunday (Aug. 4) \u2014 welcome dinner Monday (Aug. 5) \u2013 Friday (Aug. 9) \u2014 breakfast and lunch each day Friday (Aug. 9) \u2014 closing dinner Other meals not listed above are on your own. If you are not a member of the UW\u2013Madison community, we will reimburse you for the on-your-own meals, Monday through Thursday dinners; see below for details. Sorry, UW\u2013Madison folks: The rules say that we cannot reimburse you for meals here. For the meals on your own, you are welcome to join other participants and even staff! We can help with ideas and groups, if you like. There is another page with suggestions for finding dining options near the School and hotel. Catered Meals \u00b6 The catered breakfasts and lunches during the School (see above) will be served in the Computer Sciences Building. Breakfast is in the main auditorium, room 1240 , and lunch is nearby (staff will lead the way on Monday). There is nearby seating both inside and outside. Menus \u00b6 The catered meals should take into account all dietary needs that you told us about in the questionnaire. Check for labels! If you have questions, ask the catering staff (if present) or School staff. Some items, like gluten-free items, are provided in low quantities that are meant just for those people who requested them. Please do not take them unless they are for you. 
Sunday, August 4, 2024 \u00b6 Opening Dinner (6:30 PM - 8:30 PM) \u00b6 Location: Fluno Center - Skyview Room (on the 8th Floor) Cavatappi Pasta Gluten Free Pasta Cheese Lasagna Grilled Chicken Breast Homemade Chicken & Beef Meatballs Italian Vegetable Blend Breadsticks Marinara and Alfredo Sauce Caesar Salad Tiramisu Cannolis Includes Beverage Service Monday, August 5, 2024 \u00b6 Breakfast (8:00 AM - 9:00 AM) \u00b6 Badger Breakfast Turkey Sausage Links Vegan Sausage Patties Assorted Breakfast Pastries Seasonal Fresh Cut Fruit Salad Breakfast Potatoes Scrambled Eggs Regular Coffee Assorted Bottled Juice Hot Tea Lunch (12:15 PM - 1:15 PM) \u00b6 Southwest Buffet Tortilla Chips Red Salsa Spanish Rice Black Beans Beef Barbacoa Chicken Tinga Vegan Chorizo Crumble Flour Tortillas/Corn Tortillas for GF Shredded Lettuce Diced Tomatoes Jalapeno Shredded Cheddar Cheese Sour Cream Guacamole Assorted Soda, Water, and Sparkling Water PM Break (12:30 PM - 4:00 PM) \u00b6 Assorted Soda, Water, and Sparkling Water Regular Coffee Assorted Cookies Gluten Free Cookie Tuesday, August 6, 2024 \u00b6 Breakfast (8:00 AM - 9:00 AM) \u00b6 Fresh Cut Fruit Salad Assorted Muffins Gluten Free Muffin Mini Quiches Turkey Sausage Links Vegan Sausage Patties Assorted Bottled Juices Hot Tea Regular Coffee Lunch (12:15 PM - 1:15 PM) \u00b6 Italian Buffet Caesar Salad (croutons, cheese & Kalamata olives on the side) Caesar Dressing Garlic Breadsticks Pasta Gluten Free Pasta Marinara Sauce Sliced Grilled Chicken Breast Vegan Meatballs Assorted Soda, Water, and Sparkling Water PM Break (12:30 PM - 4:00 PM) \u00b6 Assorted Dessert Bars Granola Bars (GF) Regular Coffee Wednesday, August 7, 2024 \u00b6 Breakfast (8:00 AM - 9:00 AM) \u00b6 Turkey Sausage Links Vegan Sausage Patties Assorted Breakfast Pastries Gluten Free Muffin Seasonal Fresh Cut Fruit Salad Breakfast Potatoes Scrambled Eggs Regular Coffee Assorted Bottled Juice Hot Tea Lunch (12:30 PM - 1:45 PM) \u00b6 Boxed Lunches Chicken Bacon Ranch Wraps Smoked Turkey Sandwiches Southwest Salads Cookies Gluten Free Cookie Assorted Chips Assorted Chips Mediterranean Antipasto Platter Vegetable Platter with Dill Dip Italian Cold Pasta Assorted Soda, Water, and Sparkling Water Thursday, August 8, 2024 \u00b6 Breakfast (8:00 AM - 9:00 AM) \u00b6 Buckingham Breakfast Assorted Breakfast Pastries Gluten Free Muffin Seasonal Fresh Cut Fruit Salad Regular Coffee Assorted Bottled Juice Mini Quiches Turkey Sausage Links Vegan Sausage Patties Hot Tea Lunch (12:00 PM - 1:15 PM) \u00b6 Mediterranean Buffet Lemon Oregano Chicken Greek Salad with Olive Oil Vinaigrette Roasted Vegetable Couscous Stuffed Mediterranean Portobello Mushrooms (with and without feta) PM Break (12:30 PM - 4:30 PM) \u00b6 Regular Coffee Assorted Soda, Water, and Sparkling Water Assorted Cookies Gluten Free Cookie Friday, August 9, 2024 \u00b6 Breakfast (8:00 AM - 9:00 AM) \u00b6 Badger Breakfast Turkey Sausage Links Bacon Vegan Sausage Patties Assorted Breakfast Pastries Gluten Free Muffin Seasonal Fresh Cut Fruit Salad Breakfast Potatoes Scrambled Eggs Cinnamon Rolls Regular Coffee Assorted Bottled Juice Hot Tea Lunch (12:00 PM - 1:00 PM) \u00b6 Wisconsin Tailgate Garden Salad with Ranch and Balsamic dressing Fried Wedge Potatoes with Ketchup Brats with Kraut Diced Onions Ketchup Dijon Mustard Hamburgers Veggie Burgers Hamburger Buns / Gluten Free Bun Lettuce, Tomato, Onion platter Sliced Cheddar Cheese platter Pickles Ketchup, Mustard, Mayo PM Break (12:30 PM - 4:00 PM) \u00b6 Assorted Soda, Water, and Sparkling 
Water Regular Coffee Brownies Closing Dinner (6:00 PM - 8:00 PM) \u00b6 Location: Union South - Industry (3rd Floor) Global Buffet Spinach, Strawberry, Shaved Red Onion, Sesame Poppy Seed Dressing Vegetables, Dips, Spreads, Pita Chips Chicken Tikka Masala Sake Salmon Jerk Tofu Basmati Rice Naan Includes choice of coffee station or assorted cold beverages Meal Reimbursement Tips \u00b6 Again, if you are not part of the UW\u2013Madison community, we can reimburse you for dinners Monday through Thursday. We have curated a page of some possible dining options to use as inspiration. Some tips for successful reimbursements: Keep receipts for your meals \u2013 if anything so that you remember how much meals cost! We can reimburse up to $35 for dinner, including tax and tip. If it is not on the receipt, be sure to write the tip amount yourself, so you do not forget. We cannot pay for any alcohol, although non-alcoholic drinks are OK \u2014 ideally, pay for alcohol separately. We will explain the reimbursement process in detail after the School, but the tips above will help.","title":"Meal information"},{"location":"logistics/meals/#meal-information","text":"The School includes some group catered meals for all participants: Sunday (Aug. 4) \u2014 welcome dinner Monday (Aug. 5) \u2013 Friday (Aug. 9) \u2014 breakfast and lunch each day Friday (Aug. 9) \u2014 closing dinner Other meals not listed above are on your own. If you are not a member of the UW\u2013Madison community, we will reimburse you for the on-your-own meals, Monday through Thursday dinners; see below for details. Sorry, UW\u2013Madison folks: The rules say that we cannot reimburse you for meals here. For the meals on your own, you are welcome to join other participants and even staff! We can help with ideas and groups, if you like. There is another page with suggestions for finding dining options near the School and hotel.","title":"Meal Information"},{"location":"logistics/meals/#catered-meals","text":"The catered breakfasts and lunches during the School (see above) will be served in the Computer Sciences Building. Breakfast is in the main auditorium, room 1240 , and lunch is nearby (staff will lead the way on Monday). There is nearby seating both inside and outside.","title":"Catered Meals"},{"location":"logistics/meals/#menus","text":"The catered meals should take into account all dietary needs that you told us about in the questionnaire. Check for labels! If you have questions, ask the catering staff (if present) or School staff. Some items, like gluten-free items, are provided in low quantities that are meant just for those people who requested them. 
Please do not take them unless they are for you.","title":"Menus"},{"location":"logistics/meals/#sunday-august-4-2024","text":"","title":"Sunday, August 4, 2024"},{"location":"logistics/meals/#opening-dinner-630-pm-830-pm","text":"Location: Fluno Center - Skyview Room (on the 8th Floor) Cavatappi Pasta Gluten Free Pasta Cheese Lasagna Grilled Chicken Breast Homemade Chicken & Beef Meatballs Italian Vegetable Blend Breadsticks Marinara and Alfredo Sauce Caesar Salad Tiramisu Cannolis Includes Beverage Service","title":"Opening Dinner (6:30 PM - 8:30 PM)"},{"location":"logistics/meals/#monday-august-5-2024","text":"","title":"Monday, August 5, 2024"},{"location":"logistics/meals/#breakfast-800-am-900-am","text":"Badger Breakfast Turkey Sausage Links Vegan Sausage Patties Assorted Breakfast Pastries Seasonal Fresh Cut Fruit Salad Breakfast Potatoes Scrambled Eggs Regular Coffee Assorted Bottled Juice Hot Tea","title":"Breakfast (8:00 AM - 9:00 AM)"},{"location":"logistics/meals/#lunch-1215-pm-115-pm","text":"Southwest Buffet Tortilla Chips Red Salsa Spanish Rice Black Beans Beef Barbacoa Chicken Tinga Vegan Chorizo Crumble Flour Tortillas/Corn Tortillas for GF Shredded Lettuce Diced Tomatoes Jalapeno Shredded Cheddar Cheese Sour Cream Guacamole Assorted Soda, Water, and Sparkling Water","title":"Lunch (12:15 PM - 1:15 PM)"},{"location":"logistics/meals/#pm-break-1230-pm-400-pm","text":"Assorted Soda, Water, and Sparkling Water Regular Coffee Assorted Cookies Gluten Free Cookie","title":"PM Break (12:30 PM - 4:00 PM)"},{"location":"logistics/meals/#tuesday-august-6-2024","text":"","title":"Tuesday, August 6, 2024"},{"location":"logistics/meals/#breakfast-800-am-900-am_1","text":"Fresh Cut Fruit Salad Assorted Muffins Gluten Free Muffin Mini Quiches Turkey Sausage Links Vegan Sausage Patties Assorted Bottled Juices Hot Tea Regular Coffee","title":"Breakfast (8:00 AM - 9:00 AM)"},{"location":"logistics/meals/#lunch-1215-pm-115-pm_1","text":"Italian Buffet Caesar Salad (croutons, cheese & Kalamata olives on the side) Caesar Dressing Garlic Breadsticks Pasta Gluten Free Pasta Marinara Sauce Sliced Grilled Chicken Breast Vegan Meatballs Assorted Soda, Water, and Sparkling Water","title":"Lunch (12:15 PM - 1:15 PM)"},{"location":"logistics/meals/#pm-break-1230-pm-400-pm_1","text":"Assorted Dessert Bars Granola Bars (GF) Regular Coffee","title":"PM Break (12:30 PM - 4:00 PM)"},{"location":"logistics/meals/#wednesday-august-7-2024","text":"","title":"Wednesday, August 7, 2024"},{"location":"logistics/meals/#breakfast-800-am-900-am_2","text":"Turkey Sausage Links Vegan Sausage Patties Assorted Breakfast Pastries Gluten Free Muffin Seasonal Fresh Cut Fruit Salad Breakfast Potatoes Scrambled Eggs Regular Coffee Assorted Bottled Juice Hot Tea","title":"Breakfast (8:00 AM - 9:00 AM)"},{"location":"logistics/meals/#lunch-1230-pm-145-pm","text":"Boxed Lunches Chicken Bacon Ranch Wraps Smoked Turkey Sandwiches Southwest Salads Cookies Gluten Free Cookie Assorted Chips Assorted Chips Mediterranean Antipasto Platter Vegetable Platter with Dill Dip Italian Cold Pasta Assorted Soda, Water, and Sparkling Water","title":"Lunch (12:30 PM - 1:45 PM)"},{"location":"logistics/meals/#thursday-august-8-2024","text":"","title":"Thursday, August 8, 2024"},{"location":"logistics/meals/#breakfast-800-am-900-am_3","text":"Buckingham Breakfast Assorted Breakfast Pastries Gluten Free Muffin Seasonal Fresh Cut Fruit Salad Regular Coffee Assorted Bottled Juice Mini Quiches Turkey Sausage Links Vegan Sausage Patties Hot 
Tea","title":"Breakfast (8:00 AM - 9:00 AM)"},{"location":"logistics/meals/#lunch-1200-pm-115-pm","text":"Mediteranian Buffet Lemon Oregano Chicken Greek Salad with Olive Oil Vinaigrette Roasted Vegetable Couscous Stuffed Mediterranean Portobello Mushrooms (with and without feta)","title":"Lunch (12:00 PM - 1:15 PM)"},{"location":"logistics/meals/#pm-break-1230-pm-430-pm","text":"Regular Coffee Assorted Soda, Water, and Sparkling Water Assorted Cookies Gluten Free Cookie","title":"PM Break (12:30 PM - 4:30 PM)"},{"location":"logistics/meals/#friday-august-9-2024","text":"","title":"Friday, August 9, 2024"},{"location":"logistics/meals/#breakfast-800-am-900-am_4","text":"Badger Breakfast Turkey Sausage Links Bacon Vegan Sausage Patties Assorted Breakfast Pastries Gluten Free Muffin Seasonal Fresh Cut Fruit Salad Breakfast Potatoes Scrambled Eggs Cinnamon Rolls Regular Coffee Assorted Bottled Juice Hot Tea","title":"Breakfast (8:00 AM - 9:00 AM)"},{"location":"logistics/meals/#lunch-1200-pm-100-pm","text":"Wisconsin Tailgate Garden Salad with Ranch and Balsamic dressing Fried Wedge Potatoes with Ketchup Brats with Kraut Diced Onions Ketchup Dijon Mustard Hamburgers Veggie Burgers Hamburger Buns / Gluten Free Bun Lettuce, Tomato, Onion platter Sliced Cheddar Cheese platter Pickles Ketchup, Mustard, Mayo","title":"Lunch (12:00 PM - 1:00 PM)"},{"location":"logistics/meals/#pm-break-1230-pm-400-pm_2","text":"Assorted Soda, Water, and Sparkling Water Regular Coffee Brownies","title":"PM Break (12:30 PM - 4:00 PM)"},{"location":"logistics/meals/#closing-dinner-600-pm-800-pm","text":"Location: Union South - Industry (3rd Floor) Global Buffet Spinach, Strawberry, Shaved Red Onion, Sesame Poppy Seed Dressing Vegetables, Dips, Spreads, Pita Chips Chicken Tikka Masala Sake Salmon Jerk Tofu Basmati Rice Naan Includes choice of coffee station or assorted cold beverages","title":"Closing Dinner (6:00 PM - 8:00 PM)"},{"location":"logistics/meals/#meal-reimbursement-tips","text":"Again, if you are not part of the UW\u2013Madison community, we can reimburse you for dinners Monday through Thursday. We have curated a page of some possible dining options to use as inspiration. Some tips for successful reimbursements: Keep receipts for your meals \u2013 if anything so that you remember how much meals cost! We can reimburse up to $35 for dinner, including tax and tip. If it is not on the receipt, be sure to write the tip amount yourself, so you do not forget. We cannot pay for any alcohol, although non-alcoholic drinks are OK \u2014 ideally, pay for alcohol separately. We will explain the reimbursement process in detail after the School, but the tips above will help.","title":"Meal Reimbursement Tips"},{"location":"logistics/travel-advice/","text":"Travel Advice \u00b6 This page offers some tips for traveling to and from the OSG School. When travelling, you may experience delays, changes, or cancellations due to weather, mechanical issues, and so on. It is good to be prepared for last-minute changes. Below are some tips and ideas for dealing with travel. For health guidelines, before or during the event, please see our health guidelines page . Checking In Early \u00b6 Airlines generally allow you to check in for your flights the day before. Doing so may save you time and hassle at the airport. Go to your airline website and look for the \u201cCheck In\u201d section, then follow the steps. Finding Flight Status \u00b6 Be sure to check your flight status often, starting the day before travel begins. 
While you can check the status of each flight individually on the airline website (or a third-party site), you may be able to view your entire trip at once. Go to your airline website, find their section for \u201cMy Trips\u201d or something similar, and use the six-character \u201cConfirmation Number\u201d on your itinerary plus your last name to access your full itinerary, including flight status for each segment. Definitely check your flight status before leaving for the airport! If Your Arrival in Madison Is Delayed \u00b6 If your flights change and you will arrive in Madison later than planned, think about what effect that will have: If you will arrive before Sunday, 6 p.m. (or so), you should be fine. If there is time, you can still go to the hotel first; if it is after 5:30 p.m. (or so), it may be best to go straight to the Fluno Center for the welcome dinner at 6:30 p.m. If you will arrive on Sunday but after 6 p.m. (or so), you will miss the welcome dinner. Go straight to the hotel and check in, then find dinner on your own; we can reimburse you in this case. Try to let us know about the situation when you can. If you will arrive later than Sunday, just do your best to get here. Try to let us know about your situation as soon as you can. We can help deal with things like the hotel and may be able to suggest travel options. If you need to make flight changes, see below. If Your Arrival Back Home Is Delayed \u00b6 If your flights back home are delayed, there is not as much that we can do. For example, it is not clear whether we can pay for changes on return flights. Contact your airline to find out how they will get you home. If You Must Make Flight Changes \u00b6 If one or more flights are cancelled, or if we approve flight changes and their fees in advance , you will need to make new plans with your airline. If you are at an airport, it is a good idea to get in line at your airline\u2019s service counter right away. Also, you can try calling their service number while waiting in line! For any change that requires extra payment, you must get our approval and make the change through Fox World Travel , UW\u2013Madison\u2019s only approved travel agency. If you pay for a change any other way, we cannot reimburse you. Fox World Travel phone number: +1 (844) 630-3853 Note: If you call Fox World Travel on the weekend or outside of 7am\u20137:30pm (Central), they will charge us $20 just for calling. So please use this option only when you must pay for approved flight changes. If there are significant changes to your travel plans, when you have time, please email us with your news or reach out to us on Slack.","title":"Travel advice"},{"location":"logistics/travel-advice/#travel-advice","text":"This page offers some tips for traveling to and from the OSG School. When travelling, you may experience delays, changes, or cancellations due to weather, mechanical issues, and so on. It is good to be prepared for last-minute changes. Below are some tips and ideas for dealing with travel. For health guidelines, before or during the event, please see our health guidelines page .","title":"Travel Advice"},{"location":"logistics/travel-advice/#checking-in-early","text":"Airlines generally allow you to check in for your flights the day before. Doing so may save you time and hassle at the airport.
Go to your airline website and look for the \u201cCheck In\u201d section, then follow the steps.","title":"Checking In Early"},{"location":"logistics/travel-advice/#finding-flight-status","text":"Be sure to check your flight status often, starting the day before travel begins. While you can check the status of each flight individually on the airline website (or a third-party site), you may be able to view your entire trip at once. Go to your airline website, find their section for \u201cMy Trips\u201d or something similar, and use the six-character \u201cConfirmation Number\u201d on your itinerary plus your last name to access your full itinerary, including flight status for each segment. Definitely check your flight status before leaving for the airport!","title":"Finding Flight Status"},{"location":"logistics/travel-advice/#if-your-arrival-in-madison-is-delayed","text":"If your flights change and you will arrive in Madison later than planned, think about what effect that will have: If you will arrive before Sunday, 6 p.m. (or so), you should be fine. If there is time, you can still go to the hotel first; if it is after 5:30 p.m. (or so), it may be best to go straight to the Fluno Center for the welcome dinner at 6:30 p.m. If you will arrive on Sunday but after 6 p.m. (or so), you will miss the welcome dinner. Go straight to the hotel and check in, then find dinner on your own; we can reimburse you in this case. Try to let us know about the situation, when you can. If you will arrive later than Sunday, just do your best to get here. Try to let us know about your situation as soon as you can. We can help deal things like the hotel and may be able to suggest travel options. If you need to make flight changes, see below.","title":"If Your Arrival in Madison Is Delayed"},{"location":"logistics/travel-advice/#if-your-arrival-back-home-is-delayed","text":"If you flights back home are delayed, there is not as much that we can do. For example, it is not clear whether we can pay for changes on return flights. Contact your airline to find out how they will get you home.","title":"If Your Arrival Back Home Is Delayed"},{"location":"logistics/travel-advice/#if-you-must-make-flight-changes","text":"If one or more flights are cancelled, or if we approve flight changes and their fees in advance , you will need to make new plans with your airline. If you are at an airport, it is a good idea to get in line at your airline\u2019s service counter right away. Also, you can try calling their service number while waiting in line! For any change that requires extra payment, you must get our approval and make the change through Fox World Travel , UW\u2013Madison\u2019s only approved travel agency. If you pay for a change any other way, we cannot reimburse you. Fox World Travel phone number: +1 (844) 630-3853 Note: If you call Fox World Travel on the weekend or outside of 7am\u20137:30pm (Central), they will charge us $20 just for calling. So please use this option only when you must pay for approved flight changes. If there are significant changes to your travel plans, when you have time, please email us with your news or reach out to us on Slack.","title":"If You Must Make Flight Changes"},{"location":"logistics/travel-planning/","text":"Travel To and From Madison \u00b6 Please wait to begin making travel arrangements until we email you about it. We plan to email everyone about travel in early June, but are starting with a small group to find and fix issues. 
Whether we offered to pay your travel costs or not, please make sure that we get a copy of your travel plans so that we know when to expect you here and can plan accurately. (If we offered to pay for your hotel room, we will reserve your hotel room for you.) Find the numbered section below that applies to you: 1. We Offered to Pay for Your Travel \u00b6 We want to find reasonable and comfortable travel options for you. At the same time, we must stay within budget and follow University rules about arranging and paying for your travel costs. Let\u2019s work together to find something that makes sense for everyone. Here are ideas that have helped some School travelers in past years: If you are near Madison, consider driving; we can reimburse mileage and tolls up to a point, plus parking. Or look into bus routes, especially from larger cities like Chicago. The buses are very comfortable, have wi-fi, and run frequently. If you fly, try to get flights to and from Madison (MSN) itself. In some cases, we may ask you to consider flying to Milwaukee (1\u00bd hours away) or Chicago (2\u00bd hours away), then taking a direct bus to Madison; we do this only when the costs or itinerary options to Madison are terrible. If you fly, be flexible about departure times \u2014 early and late flights are often the least expensive. We do not like very early or very late flights any more than you do, so we will work hard to find reasonable flight times. Note: Please try to complete your travel plans before about July 4th, when rates may go up. Travel by Airplane \u00b6 Do NOT buy your own airline tickets . University rules say that our travel agency, Travel Incorporated, must buy your tickets. Note: The University is changing travel agencies on 1 July 2024. Please try to complete air travel arrangements by Thursday, 27 June 2024. Use the following information to get air travel tickets: In the travel email that we sent you, click the link to Travel Incorporated\u2019s \u201cUWS Traveler Booking Form\u201d (on smartsheet.com); on that form: Group Number: Copy and paste this: UWMSN061523 Traveler Type: Select \u201cGuest\u201d Concur Profile? Select \u201cNo\u201d Destination Type: Select \u201cDomestic\u201d Will a rental be needed? Select \u201cNo\u201d \u2014 we cannot pay for a rental car Are Hotel Accommodations needed? Select \u201cNo\u201d \u2014 we will arrange your hotel room separately Guest Information: Please contact us first to bring guests We must review and approve some itineraries. Travel Inc can purchase tickets directly in many cases. But if the Travel Inc agent says that your trip must be reviewed, do not worry! It just means that we need to check the budget, options, and UW rules. We hope to approve your first choice, or we will work with you and Travel Inc to find another reasonable one. Common reasons for a trip needing review are: total trip cost over $800, travel starting and ending at different locations, and travel on dates other than August 4 and 10. Approval takes time, so it may take 1\u20132 days to get confirmation. Airplane tickets cannot be held without purchase over a weekend, so avoid contacting Travel Inc late on Fridays. Please be considerate of the Travel Inc agent(s) you work with. They work hard to find good options for you, but they must also follow our rules. If you feel that they are not providing the options that you want, you should email us . We will help resolve any issues. 
Do not argue with the Travel Inc agents, especially about options you find online \u2014 there are many reasons why that option might not be available to us. Travel by Bus \u00b6 For some nearby locations, or in addition to air travel to Chicago or Milwaukee, it may be helpful to take a bus to Madison. Bus companies that School travelers have used often in the past are: Van Galder Bus , especially from Chicago Badger Bus , especially from Milwaukee To get bus tickets, pick one method: Ask us to buy bus tickets for you in advance. This is the easiest option all around. Just email us at school@osg-htc.org ; include your desired travel dates (tickets are not specific by time), and start and end bus stations or stops. Buy bus tickets for yourself. You may purchase bus tickets yourself before or on the day of travel. If you purchase your own tickets, you must get approval from the School for the estimated cost first, then request reimbursement from us after the School. If you purchase your own tickets, save the original receipt (even if by email). It is best to have a detailed receipt (including your name, itinerary, date of purchase, and total amount paid), but a regular ticket stub (e.g., without your name or date) should work fine. Just get what you can! Be sure to email us with your bus plans, including: Transportation provider(s) (e.g., Van Galder bus) Arrival date and approximate time Departure date and approximate time Arrival and departure location within Madison Actual or estimated cost (indicate which) Travel by Personal Car \u00b6 If you are driving to Madison, you will be reimbursed the mileage rate of $0.670 per mile for the shortest round-trip distance (as calculated by Google Maps), plus tolls. Also, we will pay for parking costs for the week at the hotel in Madison (but not elsewhere). We recommend keeping your receipts for tolls. Note: Due to the high mileage reimbursement rate, driving can be an expensive option! We reserve the right to limit your total driving reimbursement, so work with us on the details. To travel by personal car, please check with us first. We may search for comparable flight options, to make sure that driving is the least expensive method. Be sure to email us with your travel plans as soon as possible. Try to include: Departure date from home, location (for mileage calculation), and approximate time of arrival in Madison Departure date and approximate time from Madison, and return location (for mileage calculation) if different than above 2. We Are Not Paying for Your Travel \u00b6 If you are paying for your own travel or if someone else is paying for it, go ahead and make your travel arrangements now! Just remember to arrive on Sunday, August 4, before about 5:00 pm and depart on Saturday, August 10, or whatever dates we suggested directly to you. For other travel dates, check with us first, please! Be sure to email us with your travel plans as soon as possible. Try to include: Transportation provider(s) (e.g., airline) Arrival date and approximate time Departure date and approximate time Arrival and departure location within Madison (e.g., airport, bus station, etc.)","title":"Travel planning"},{"location":"logistics/travel-planning/#travel-to-and-from-madison","text":"Please wait to begin making travel arrangements until we email you about it. We plan to email everyone about travel in early June, but are starting with a small group to find and fix issues. 
Whether we offered to pay your travel costs or not, please make sure that we get a copy of your travel plans so that we know when to expect you here and can plan accurately. (If we offered to pay for your hotel room, we will reserve your hotel room for you.) Find the numbered section below that applies to you:","title":"Travel To and From Madison"},{"location":"logistics/travel-planning/#1-we-offered-to-pay-for-your-travel","text":"We want to find reasonable and comfortable travel options for you. At the same time, we must stay within budget and follow University rules about arranging and paying for your travel costs. Let\u2019s work together to find something that makes sense for everyone. Here are ideas that have helped some School travelers in past years: If you are near Madison, consider driving; we can reimburse mileage and tolls up to a point, plus parking. Or look into bus routes, especially from larger cities like Chicago. The buses are very comfortable, have wi-fi, and run frequently. If you fly, try to get flights to and from Madison (MSN) itself. In some cases, we may ask you to consider flying to Milwaukee (1\u00bd hours away) or Chicago (2\u00bd hours away), then taking a direct bus to Madison; we do this only when the costs or itinerary options to Madison are terrible. If you fly, be flexible about departure times \u2014 early and late flights are often the least expensive. We do not like very early or very late flights any more than you do, so we will work hard to find reasonable flight times. Note: Please try to complete your travel plans before about July 4th, when rates may go up.","title":"1. We Offered to Pay for Your Travel"},{"location":"logistics/travel-planning/#travel-by-airplane","text":"Do NOT buy your own airline tickets . University rules say that our travel agency, Travel Incorporated, must buy your tickets. Note: The University is changing travel agencies on 1 July 2024. Please try to complete air travel arrangements by Thursday, 27 June 2024. Use the following information to get air travel tickets: In the travel email that we sent you, click the link to Travel Incorporated\u2019s \u201cUWS Traveler Booking Form\u201d (on smartsheet.com); on that form: Group Number: Copy and paste this: UWMSN061523 Traveler Type: Select \u201cGuest\u201d Concur Profile? Select \u201cNo\u201d Destination Type: Select \u201cDomestic\u201d Will a rental be needed? Select \u201cNo\u201d \u2014 we cannot pay for a rental car Are Hotel Accommodations needed? Select \u201cNo\u201d \u2014 we will arrange your hotel room separately Guest Information: Please contact us first to bring guests We must review and approve some itineraries. Travel Inc can purchase tickets directly in many cases. But if the Travel Inc agent says that your trip must be reviewed, do not worry! It just means that we need to check the budget, options, and UW rules. We hope to approve your first choice, or we will work with you and Travel Inc to find another reasonable one. Common reasons for a trip needing review are: total trip cost over $800, travel starting and ending at different locations, and travel on dates other than August 4 and 10. Approval takes time, so it may take 1\u20132 days to get confirmation. Airplane tickets cannot be held without purchase over a weekend, so avoid contacting Travel Inc late on Fridays. Please be considerate of the Travel Inc agent(s) you work with. They work hard to find good options for you, but they must also follow our rules. 
If you feel that they are not providing the options that you want, you should email us . We will help resolve any issues. Do not argue with the Travel Inc agents, especially about options you find online \u2014 there are many reasons why that option might not be available to us.","title":"Travel by Airplane"},{"location":"logistics/travel-planning/#travel-by-bus","text":"For some nearby locations, or in addition to air travel to Chicago or Milwaukee, it may be helpful to take a bus to Madison. Bus companies that School travelers have used often in the past are: Van Galder Bus , especially from Chicago Badger Bus , especially from Milwaukee To get bus tickets, pick one method: Ask us to buy bus tickets for you in advance. This is the easiest option all around. Just email us at school@osg-htc.org ; include your desired travel dates (tickets are not specific by time), and start and end bus stations or stops. Buy bus tickets for yourself. You may purchase bus tickets yourself before or on the day of travel. If you purchase your own tickets, you must get approval from the School for the estimated cost first, then request reimbursement from us after the School. If you purchase your own tickets, save the original receipt (even if by email). It is best to have a detailed receipt (including your name, itinerary, date of purchase, and total amount paid), but a regular ticket stub (e.g., without your name or date) should work fine. Just get what you can! Be sure to email us with your bus plans, including: Transportation provider(s) (e.g., Van Galder bus) Arrival date and approximate time Departure date and approximate time Arrival and departure location within Madison Actual or estimated cost (indicate which)","title":"Travel by Bus"},{"location":"logistics/travel-planning/#travel-by-personal-car","text":"If you are driving to Madison, you will be reimbursed the mileage rate of $0.670 per mile for the shortest round-trip distance (as calculated by Google Maps), plus tolls. Also, we will pay for parking costs for the week at the hotel in Madison (but not elsewhere). We recommend keeping your receipts for tolls. Note: Due to the high mileage reimbursement rate, driving can be an expensive option! We reserve the right to limit your total driving reimbursement, so work with us on the details. To travel by personal car, please check with us first. We may search for comparable flight options, to make sure that driving is the least expensive method. Be sure to email us with your travel plans as soon as possible. Try to include: Departure date from home, location (for mileage calculation), and approximate time of arrival in Madison Departure date and approximate time from Madison, and return location (for mileage calculation) if different than above","title":"Travel by Personal Car"},{"location":"logistics/travel-planning/#2-we-are-not-paying-for-your-travel","text":"If you are paying for your own travel or if someone else is paying for it, go ahead and make your travel arrangements now! Just remember to arrive on Sunday, August 4, before about 5:00 pm and depart on Saturday, August 10, or whatever dates we suggested directly to you. For other travel dates, check with us first, please! Be sure to email us with your travel plans as soon as possible. Try to include: Transportation provider(s) (e.g., airline) Arrival date and approximate time Departure date and approximate time Arrival and departure location within Madison (e.g., airport, bus station, etc.)","title":"2. 
We Are Not Paying for Your Travel"},{"location":"logistics/visas/","text":"Documentation Requirements for Non-Resident Aliens \u00b6 This page is for Non-Resident Aliens only. If you are a United States citizen or permanent resident or member of the UW\u2013Madison community, this page does not apply to you. For the University of Wisconsin to pay for your travel, hotel, or meal expenses, we must have certain personal information from you. We collect as little information as possible and do not share it except with University staff who need it. Most of what we need comes from the online form you completed after accepting our invitation to attend. When you come to the School in Madison, we will need to look at and verify your travel documents. Please bring all travel documents to the School! See below for details. Tasks To Do Now \u00b6 Please check your passport and visa for travel in the United States now. Make sure that all documents are valid from now and until after the School ends. If any documents are expired or will expire before the end of the School: Tell us immediately, so that we can help you Begin the process for updating your documents immediately Do whatever you can to expedite the update process The University of Wisconsin cannot pay for or reimburse you for costs without valid travel documents. We have no control over this policy and there are no exceptions. If you are in the United States on a J-1 Scholar visa, there are extra steps needed to make the University and Federal government happy. If you have a J-1 visa and have not heard from us about it already, please email us immediately so that we can help. Documents to Bring to the School \u00b6 When you come to Madison, you must bring: Passport U.S. visa U.S. Customs and Border Protection form I-94 If you entered the U.S. before 30 April 2013, the I-94 should be stapled into your passport \u2014 do not remove it! If you entered the U.S. after 30 April 2013, the I-94 is stored electronically; you can request a copy to print from CBP If you are Canadian, you may use a second form of picture ID instead of the I-94 if you did not obtain an I-94. Additional forms specified in the table below: If you have this visa We will also need F-1 (Student) Form I-20 (original document, not a copy) J-1 (Visitor) Form DS-2019 (original document, not a copy) Visa Waiver Program Paper copy of ESTA Authorization Please bring all required information and documents to the School, especially on Tuesday, August 6. School staff will make copies of the documents and return them to you as quickly as possible. We will announce further details in class.","title":"Visa requirements"},{"location":"logistics/visas/#documentation-requirements-for-non-resident-aliens","text":"This page is for Non-Resident Aliens only. If you are a United States citizen or permanent resident or member of the UW\u2013Madison community, this page does not apply to you. For the University of Wisconsin to pay for your travel, hotel, or meal expenses, we must have certain personal information from you. We collect as little information as possible and do not share it except with University staff who need it. Most of what we need comes from the online form you completed after accepting our invitation to attend. When you come to the School in Madison, we will need to look at and verify your travel documents. Please bring all travel documents to the School! 
See below for details.","title":"Documentation Requirements for Non-Resident Aliens"},{"location":"logistics/visas/#tasks-to-do-now","text":"Please check your passport and visa for travel in the United States now. Make sure that all documents are valid from now and until after the School ends. If any documents are expired or will expire before the end of the School: Tell us immediately, so that we can help you Begin the process for updating your documents immediately Do whatever you can to expedite the update process The University of Wisconsin cannot pay for or reimburse you for costs without valid travel documents. We have no control over this policy and there are no exceptions. If you are in the United States on a J-1 Scholar visa, there are extra steps needed to make the University and Federal government happy. If you have a J-1 visa and have not heard from us about it already, please email us immediately so that we can help.","title":"Tasks To Do Now"},{"location":"logistics/visas/#documents-to-bring-to-the-school","text":"When you come to Madison, you must bring: Passport U.S. visa U.S. Customs and Border Protection form I-94 If you entered the U.S. before 30 April 2013, the I-94 should be stapled into your passport \u2014 do not remove it! If you entered the U.S. after 30 April 2013, the I-94 is stored electronically; you can request a copy to print from CBP If you are Canadian, you may use a second form of picture ID instead of the I-94 if you did not obtain an I-94. Additional forms specified in the table below: If you have this visa We will also need F-1 (Student) Form I-20 (original document, not a copy) J-1 (Visitor) Form DS-2019 (original document, not a copy) Visa Waiver Program Paper copy of ESTA Authorization Please bring all required information and documents to the School, especially on Tuesday, August 6. School staff will make copies of the documents and return them to you as quickly as possible. We will announce further details in class.","title":"Documents to Bring to the School"},{"location":"materials/","text":"OSG School Materials \u00b6 School Overview and Intro \u00b6 View the slides: pdf Intro to HTC and HTCondor Job Execution \u00b6 Intro to HTC Slides \u00b6 Intro to HTC: pptx Worksheet: pdf or Google Drive Intro to HTCondor Slides \u00b6 View the slides: pdf Intro Exercises 1: Running and Viewing Simple Jobs (Strongly Recommended) \u00b6 Exercise 1.1: Log in to the local submit machine and look around Exercise 1.2: Experiment with HTCondor commands Exercise 1.3: Run jobs! 
Exercise 1.4: Read and interpret log files Exercise 1.5: Determining Resource Needs Exercise 1.6: Remove jobs from the queue Bonus Exercises: Job Attributes and Handling \u00b6 Bonus Exercise 1.7: Compile and run some C code Bonus Exercise 1.8: Explore condor_q Bonus Exercise 1.9: Explore condor_status Intro to HTCondor Multiple Job Execution \u00b6 View the Slides: pdf Intro Exercises 2: Running Many HTC Jobs (Strongly Recommended) \u00b6 Exercise 2.1: Work with input and output files Exercise 2.2: Use queue N , $(Cluster) , and $(Process) Exercise 2.3: Use queue from with custom variables Bonus Exercise 2.4: Use queue matching with a custom variable OSG \u00b6 View the slides: pdf OSG Exercises: Comparing PATh and OSG (Strongly Recommended) \u00b6 Exercise 1.1: Log in to the OSPool Access Point Exercise 1.2: Running jobs in the OSPool Exercise 1.3: Hardware differences between PATh and OSG Exercise 1.4: Software differences in OSPool Troubleshooting \u00b6 View the Slides: pdf ppt Troubleshooting Exercises: \u00b6 Exercise 1.1: Troubleshooting Jobs Exercise 1.2: Job Retry Software \u00b6 Slides: pdf , pptx Software Exercises 1: Exploring Containers \u00b6 Exercise 1.1: Run and Explore Apptainer Containers Exercise 1.2: Use Apptainer Containers in OSPool Jobs Exercise 1.3: Use Docker Containers in OSPool Jobs Exercise 1.4: Build, Test, and Deploy an Apptainer Container Exercise 1.5: Choose Software Options Software Exercises 2: Preparing Scripts \u00b6 Exercise 2.1: Build an HTC-Friendly Executable Software Exercises 3: Container Examples (Optional) \u00b6 Exercise 3.1: Create an Apptainer Definition Files Exercise 3.2: Build Your Own Docker Container Software Exercises 4: Exploring Compiled Software (Optional) \u00b6 Exercise 4.1: Download and Use Compiled Software Exercise 4.2: Use a Wrapper Script To Run Software Exercise 4.3: Using Arguments With Wrapper Scripts Software Exercises 5: Compiled Software Examples (Optional) \u00b6 Exercise 5.1: Compiling a Research Software Exercise 5.2: Compiling Python and Running Jobs Exercise 5.3: Using Conda Environments Exercise 5.4: Compiling and Running a Simple Code Data \u00b6 View the slides: pdf Data Exercises 1: HTCondor File Transfer (Strongly Recommended) \u00b6 Exercise 1.1: Understanding a job's data needs Exercise 1.2: transfer_input_files, transfer_output_files, and remaps Exercise 1.3: Splitting input Data Exercises 2: Using OSDF (Strongly Recommended) \u00b6 Exercise 2.1: OSDF for inputs Exercise 2.2: OSDF for outputs Scaling Up \u00b6 View the slides: pptx Scaling Up Exercises \u00b6 Exercise 1.1: Organizing HTC workloads Exercise 1.2: Investigating Job Attributes Exercise 1.3: Getting Job Information from Log Files Workflows with DAGMan \u00b6 View the slides: [Slides coming soon] DAGMan Exercises 1 \u00b6 Exercise 1.1: Coordinating set of jobs: A simple DAG Exercise 1.2: A brief detour through the Mandelbrot set Exercise 1.3: A more complex DAG Exercise 1.4: Handling jobs that fail with DAGMan Exercise 1.5: Workflow Challenges Extra Topics \u00b6 Self-checkpointing for long-running jobs \u00b6 View the slides: [Slides coming soon] Exercise 1.1: Trying out self-checkpointing Special Environments \u00b6 View the slides: [Slides coming soon] Special Environments Exercises 1 \u00b6 Exercise 1.1: GPUs Introduction to Research Computing Facilitation \u00b6 View the slides: [Slides coming soon] Final Talks \u00b6 Philosophy: [Slides coming soon] Final thoughts: [Slides coming 
soon]","title":"Overview"},{"location":"materials/#osg-school-materials","text":"","title":"OSG School Materials"},{"location":"materials/#school-overview-and-intro","text":"View the slides: pdf","title":"School Overview and Intro"},{"location":"materials/#intro-to-htc-and-htcondor-job-execution","text":"","title":"Intro to HTC and HTCondor Job Execution"},{"location":"materials/#intro-to-htc-slides","text":"Intro to HTC: pptx Worksheet: pdf or Google Drive","title":"Intro to HTC Slides"},{"location":"materials/#intro-to-htcondor-slides","text":"View the slides: pdf","title":"Intro to HTCondor Slides"},{"location":"materials/#intro-exercises-1-running-and-viewing-simple-jobs-strongly-recommended","text":"Exercise 1.1: Log in to the local submit machine and look around Exercise 1.2: Experiment with HTCondor commands Exercise 1.3: Run jobs! Exercise 1.4: Read and interpret log files Exercise 1.5: Determining Resource Needs Exercise 1.6: Remove jobs from the queue","title":"Intro Exercises 1: Running and Viewing Simple Jobs (Strongly Recommended)"},{"location":"materials/#bonus-exercises-job-attributes-and-handling","text":"Bonus Exercise 1.7: Compile and run some C code Bonus Exercise 1.8: Explore condor_q Bonus Exercise 1.9: Explore condor_status","title":"Bonus Exercises: Job Attributes and Handling"},{"location":"materials/#intro-to-htcondor-multiple-job-execution","text":"View the Slides: pdf","title":"Intro to HTCondor Multiple Job Execution"},{"location":"materials/#intro-exercises-2-running-many-htc-jobs-strongly-recommended","text":"Exercise 2.1: Work with input and output files Exercise 2.2: Use queue N , $(Cluster) , and $(Process) Exercise 2.3: Use queue from with custom variables Bonus Exercise 2.4: Use queue matching with a custom variable","title":"Intro Exercises 2: Running Many HTC Jobs (Strongly Recommended)"},{"location":"materials/#osg","text":"View the slides: pdf","title":"OSG"},{"location":"materials/#osg-exercises-comparing-path-and-osg-strongly-recommended","text":"Exercise 1.1: Log in to the OSPool Access Point Exercise 1.2: Running jobs in the OSPool Exercise 1.3: Hardware differences between PATh and OSG Exercise 1.4: Software differences in OSPool","title":"OSG Exercises: Comparing PATh and OSG (Strongly Recommended)"},{"location":"materials/#troubleshooting","text":"View the Slides: pdf ppt","title":"Troubleshooting"},{"location":"materials/#troubleshooting-exercises","text":"Exercise 1.1: Troubleshooting Jobs Exercise 1.2: Job Retry","title":"Troubleshooting Exercises:"},{"location":"materials/#software","text":"Slides: pdf , pptx","title":"Software"},{"location":"materials/#software-exercises-1-exploring-containers","text":"Exercise 1.1: Run and Explore Apptainer Containers Exercise 1.2: Use Apptainer Containers in OSPool Jobs Exercise 1.3: Use Docker Containers in OSPool Jobs Exercise 1.4: Build, Test, and Deploy an Apptainer Container Exercise 1.5: Choose Software Options","title":"Software Exercises 1: Exploring Containers"},{"location":"materials/#software-exercises-2-preparing-scripts","text":"Exercise 2.1: Build an HTC-Friendly Executable","title":"Software Exercises 2: Preparing Scripts"},{"location":"materials/#software-exercises-3-container-examples-optional","text":"Exercise 3.1: Create an Apptainer Definition Files Exercise 3.2: Build Your Own Docker Container","title":"Software Exercises 3: Container Examples (Optional)"},{"location":"materials/#software-exercises-4-exploring-compiled-software-optional","text":"Exercise 4.1: Download and Use 
Compiled Software Exercise 4.2: Use a Wrapper Script To Run Software Exercise 4.3: Using Arguments With Wrapper Scripts","title":"Software Exercises 4: Exploring Compiled Software (Optional)"},{"location":"materials/#software-exercises-5-compiled-software-examples-optional","text":"Exercise 5.1: Compiling a Research Software Exercise 5.2: Compiling Python and Running Jobs Exercise 5.3: Using Conda Environments Exercise 5.4: Compiling and Running a Simple Code","title":"Software Exercises 5: Compiled Software Examples (Optional)"},{"location":"materials/#data","text":"View the slides: pdf","title":"Data"},{"location":"materials/#data-exercises-1-htcondor-file-transfer-strongly-recommended","text":"Exercise 1.1: Understanding a job's data needs Exercise 1.2: transfer_input_files, transfer_output_files, and remaps Exercise 1.3: Splitting input","title":"Data Exercises 1: HTCondor File Transfer (Strongly Recommended)"},{"location":"materials/#data-exercises-2-using-osdf-strongly-recommended","text":"Exercise 2.1: OSDF for inputs Exercise 2.2: OSDF for outputs","title":"Data Exercises 2: Using OSDF (Strongly Recommended)"},{"location":"materials/#scaling-up","text":"View the slides: pptx","title":"Scaling Up"},{"location":"materials/#scaling-up-exercises","text":"Exercise 1.1: Organizing HTC workloads Exercise 1.2: Investigating Job Attributes Exercise 1.3: Getting Job Information from Log Files","title":"Scaling Up Exercises"},{"location":"materials/#workflows-with-dagman","text":"View the slides: [Slides coming soon]","title":"Workflows with DAGMan"},{"location":"materials/#dagman-exercises-1","text":"Exercise 1.1: Coordinating set of jobs: A simple DAG Exercise 1.2: A brief detour through the Mandelbrot set Exercise 1.3: A more complex DAG Exercise 1.4: Handling jobs that fail with DAGMan Exercise 1.5: Workflow Challenges","title":"DAGMan Exercises 1"},{"location":"materials/#extra-topics","text":"","title":"Extra Topics"},{"location":"materials/#self-checkpointing-for-long-running-jobs","text":"View the slides: [Slides coming soon] Exercise 1.1: Trying out self-checkpointing","title":"Self-checkpointing for long-running jobs"},{"location":"materials/#special-environments","text":"View the slides: [Slides coming soon]","title":"Special Environments"},{"location":"materials/#special-environments-exercises-1","text":"Exercise 1.1: GPUs","title":"Special Environments Exercises 1"},{"location":"materials/#introduction-to-research-computing-facilitation","text":"View the slides: [Slides coming soon]","title":"Introduction to Research Computing Facilitation"},{"location":"materials/#final-talks","text":"Philosophy: [Slides coming soon] Final thoughts: [Slides coming soon]","title":"Final Talks"},{"location":"materials/checkpoint/part1-ex1-checkpointing/","text":"Self-Checkpointing Exercise 1.1: Trying It Out \u00b6 The goal of this exercise is to practice writing a submit file for self-checkpointing, and to see the process in action. Calculating Fibonacci numbers \u2026 slowly \u00b6 The sample code for this exercise calculates the Fibonacci number resulting from a given set of iterations. Because this is a trival computation, the code includes a delay in each iteration through the main loop; this simulates a more intensive computation. 
To get set up: Log in to ap40.uw.osg-htc.org ( ap1 is fine, too) Create and change into a new directory for this exercise Download the Python script that is the main executable for this exercise: user@server $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/fibonacci.py If you want to run the script directly, make it executable first: user@server $ chmod 0755 fibonacci.py Take a look at the code, if you like. It is not very elegant, but it gets the job done. A few notes: The script takes a single argument, the number of iterations to run. To minimize computing time while leaving time to explore, 10 is a good number of iterations. The script checkpoints every other iteration through the main loop. The exit status code for a checkpoint is 85. It prints some output to standard out along the way, to let you know what is going on. The final result is written to a separate file named fibonacci.result . This file does not exist until the very end of the complete run. It is safe to run from the command line on an access point: user@server $ ./fibonacci.py 10 If you run it, what happens? (Due to the 30-second delay, be patient.) Can you explain its behavior? What happens if you run it again, without changing any files in between? Why? Preparing to run \u00b6 Now you have an executable and you know how to run it. It is time to prepare it for submission to HTCondor! Using what you know about the script (above), and using information in the slides from today, try writing a submit file that runs this software and implements exit-driven self-checkpointing. The Python code itself is ready and should not need any changes. Just use a plain queue statement, one job is enough to experiment on. Before you submit, read the next section first! Running and monitoring \u00b6 With the 30-second delay per iteration in the code and the suggested 10 iterations, once the script starts running you have about 5 minutes of runtime in which to see what is going on. So it may help to read through this section and then return here and submit your job. If your job has problems or finishes before you have the chance to do all the steps below, just remove the extra files (besides the Python script and your submit file) and try again! Submission and first checkpoint \u00b6 Submit the job Look at the contents of the submit directory \u2014 what changed? Start watching the log file: tail -n 100 -f YOUR-LOG-FILENAME.log Be patient! As HTCondor adds more lines to the end of your log file, they will appear automatically. Thus, nothing much will happen until HTCondor starts running your job. When it does, you will see three sets of messages in the log file quickly: Started transferring input files Finished transferring input files Job executing on host: (Of course, each message will contain a lot of other characters!) Now wait about 1 minute, and you should see two more messages appear: Started transferring output files Finished transferring output files That is the first checkpoint happening! Forcing your job to stop running \u00b6 Now, assuming that your job is still running (check condor_q again), you can force HTCondor to remove ( evict ) your job before it finishes: Run condor_q to get the job ID of the running job Run condor_vacate_job JOB_ID , where you replace JOB_ID with your job ID from above Monitor the action again by running tail -n 100 -f YOUR-LOG-FILENAME.log Finishing the job and wrap-up \u00b6 Be patient again! You removed your running job, and so HTCondor put it back in the queue as idle. 
If you wait a minute or two, you should see that HTCondor starts running the job again. In the log file, look carefully for the two Job executing on host: messages. Does it seem like you ran on the same computer again or on a different one? Both are possible! Let your job finish running this time. There should be a Job terminated of its own accord message near the end. Did you get results? Go through all the files and see what they contain. The log and output files are probably the most interesting. But did you get a result file, too? Did the output file \u2014 that is, whatever file you named in the output line of your submit file \u2014 contain everything that you expected it to? Conclusion \u00b6 This has been a brief and simple tour of self-checkpointing. If you would like to learn more, please read the Self-Checkpointing Applications section of the HTCondor Manual. Or talk to School staff about it. Or contact support@osg-htc.org for further help at any time.","title":"1.1 - Trying out self-checkpointing"},{"location":"materials/checkpoint/part1-ex1-checkpointing/#self-checkpointing-exercise-11-trying-it-out","text":"The goal of this exercise is to practice writing a submit file for self-checkpointing, and to see the process in action.","title":"Self-Checkpointing Exercise 1.1: Trying It Out"},{"location":"materials/checkpoint/part1-ex1-checkpointing/#calculating-fibonacci-numbers-slowly","text":"The sample code for this exercise calculates the Fibonacci number resulting from a given set of iterations. Because this is a trival computation, the code includes a delay in each iteration through the main loop; this simulates a more intensive computation. To get set up: Log in to ap40.uw.osg-htc.org ( ap1 is fine, too) Create and change into a new directory for this exercise Download the Python script that is the main executable for this exercise: user@server $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/fibonacci.py If you want to run the script directly, make it executable first: user@server $ chmod 0755 fibonacci.py Take a look at the code, if you like. It is not very elegant, but it gets the job done. A few notes: The script takes a single argument, the number of iterations to run. To minimize computing time while leaving time to explore, 10 is a good number of iterations. The script checkpoints every other iteration through the main loop. The exit status code for a checkpoint is 85. It prints some output to standard out along the way, to let you know what is going on. The final result is written to a separate file named fibonacci.result . This file does not exist until the very end of the complete run. It is safe to run from the command line on an access point: user@server $ ./fibonacci.py 10 If you run it, what happens? (Due to the 30-second delay, be patient.) Can you explain its behavior? What happens if you run it again, without changing any files in between? Why?","title":"Calculating Fibonacci numbers … slowly"},{"location":"materials/checkpoint/part1-ex1-checkpointing/#preparing-to-run","text":"Now you have an executable and you know how to run it. It is time to prepare it for submission to HTCondor! Using what you know about the script (above), and using information in the slides from today, try writing a submit file that runs this software and implements exit-driven self-checkpointing. The Python code itself is ready and should not need any changes. Just use a plain queue statement, one job is enough to experiment on. 
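If you get stuck, here is one minimal sketch of what such a submit file could look like. The filenames and resource requests below are only placeholders, and the slides may suggest additional attributes; the key line for exit-driven self-checkpointing is checkpoint_exit_code, which matches the script's checkpoint exit status of 85.

```
# Minimal sketch of an exit-driven self-checkpointing submit file.
# Filenames and resource requests are placeholders -- adjust as needed.
executable           = fibonacci.py
arguments            = 10

# Exit code 85 means "checkpoint taken"; HTCondor saves the files and restarts the job
checkpoint_exit_code = 85

log    = fibonacci.log
output = fibonacci.out
error  = fibonacci.err

request_cpus   = 1
request_memory = 1GB
request_disk   = 1GB

queue
```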
Before you submit, read the next section first!","title":"Preparing to run"},{"location":"materials/checkpoint/part1-ex1-checkpointing/#running-and-monitoring","text":"With the 30-second delay per iteration in the code and the suggested 10 iterations, once the script starts running you have about 5 minutes of runtime in which to see what is going on. So it may help to read through this section and then return here and submit your job. If your job has problems or finishes before you have the chance to do all the steps below, just remove the extra files (besides the Python script and your submit file) and try again!","title":"Running and monitoring"},{"location":"materials/checkpoint/part1-ex1-checkpointing/#submission-and-first-checkpoint","text":"Submit the job Look at the contents of the submit directory \u2014 what changed? Start watching the log file: tail -n 100 -f YOUR-LOG-FILENAME.log Be patient! As HTCondor adds more lines to the end of your log file, they will appear automatically. Thus, nothing much will happen until HTCondor starts running your job. When it does, you will see three sets of messages in the log file quickly: Started transferring input files Finished transferring input files Job executing on host: (Of course, each message will contain a lot of other characters!) Now wait about 1 minute, and you should see two more messages appear: Started transferring output files Finished transferring output files That is the first checkpoint happening!","title":"Submission and first checkpoint"},{"location":"materials/checkpoint/part1-ex1-checkpointing/#forcing-your-job-to-stop-running","text":"Now, assuming that your job is still running (check condor_q again), you can force HTCondor to remove ( evict ) your job before it finishes: Run condor_q to get the job ID of the running job Run condor_vacate_job JOB_ID , where you replace JOB_ID with your job ID from above Monitor the action again by running tail -n 100 -f YOUR-LOG-FILENAME.log","title":"Forcing your job to stop running"},{"location":"materials/checkpoint/part1-ex1-checkpointing/#finishing-the-job-and-wrap-up","text":"Be patient again! You removed your running job, and so HTCondor put it back in the queue as idle. If you wait a minute or two, you should see that HTCondor starts running the job again. In the log file, look carefully for the two Job executing on host: messages. Does it seem like you ran on the same computer again or on a different one? Both are possible! Let your job finish running this time. There should be a Job terminated of its own accord message near the end. Did you get results? Go through all the files and see what they contain. The log and output files are probably the most interesting. But did you get a result file, too? Did the output file \u2014 that is, whatever file you named in the output line of your submit file \u2014 contain everything that you expected it to?","title":"Finishing the job and wrap-up"},{"location":"materials/checkpoint/part1-ex1-checkpointing/#conclusion","text":"This has been a brief and simple tour of self-checkpointing. If you would like to learn more, please read the Self-Checkpointing Applications section of the HTCondor Manual. Or talk to School staff about it. 
Or contact support@osg-htc.org for further help at any time.","title":"Conclusion"},{"location":"materials/data/part1-ex1-data-needs/","text":"Data Exercise 1.1: Understanding Data Requirements \u00b6 Exercise Goal \u00b6 This exercise's goal is to learn to think critically about an application's data needs, especially before submitting a large batch of jobs or using tools for delivering large data to jobs. In this exercise we will attempt to understand the input and output of the bioinformatics application BLAST . Setup \u00b6 For this exercise, we will use the ap40.uw.osg-htc.org access point. Log in: $ ssh <username>@ap40.uw.osg-htc.org Create a directory for this exercise named blast-data and change into it Copy the Input Files \u00b6 To run BLAST, we need the executable, input file, and reference database. For this example, we'll use the \"pdbaa\" database, which contains sequences for the protein structure from the Protein Data Bank. For our input file, we'll use an abbreviated fasta file with mouse genome information. Copy the BLAST executables: user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/ncbi-blast-2.12.0+-x64-linux.tar.gz user@ap40 $ tar -xzvf ncbi-blast-2.12.0+-x64-linux.tar.gz Download these files to your current directory: user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/pdbaa.tar.gz user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/mouse.fa Untar the pdbaa database: user@ap40 $ tar -xzvf pdbaa.tar.gz Understanding BLAST \u00b6 Remember that blastx is executed in a command like the following: user@ap40 $ ./ncbi-blast-2.12.0+/bin/blastx -db <database> -query <input file> -out <output file> In the above, the <input file> is the name of a file containing a number of genetic sequences (e.g. mouse.fa ), and the database that these are compared against is made up of several files that begin with the same <database> name (e.g. pdbaa/pdbaa ). The output from this analysis will be printed to the <output file> that is also indicated in the command. Calculating Data Needs \u00b6 Using the files that you prepared in blast-data , we will calculate how much disk space is needed if we were to run a hypothetical BLAST job with a wrapper script, where the job: Transfers all of its input files (including the executable) as tarballs Untars the input file tarballs on the execute host Runs blastx using the untarred input files Here are some commands that will be useful for calculating your job's storage needs: List the size of a specific file: user@ap40 $ ls -lh <file> List the sizes of all files in the current directory: user@ap40 $ ls -lh Sum the size of all files in a specific directory: user@ap40 $ du -sh <directory> Input requirements \u00b6 Total up the amount of data in all of the files necessary to run the blastx wrapper job, including the executable itself. Write down this number. Also take note of how much total data is in the pdbaa directory. Compressed Files Remember, blastx reads the un-compressed pdbaa files. Output requirements \u00b6 The output that we care about from blastx is saved in the file whose name is indicated after the -out argument to blastx . Also, remember that HTCondor creates the error, output, and log files, which you'll need to add up, too. Are there any other files? Total all of these together, as well. Up next! \u00b6 Next you will create an HTCondor submit script to transfer the Blast input files in order to run Blast on a worker node.
Next Exercise","title":"1.1 - Understanding a job's data needs"},{"location":"materials/data/part1-ex1-data-needs/#data-exercise-11-understanding-data-requirements","text":"","title":"Data Exercise 1.1: Understanding Data Requirements"},{"location":"materials/data/part1-ex1-data-needs/#exercise-goal","text":"This exercise's goal is to learn to think critically about an application's data needs, especially before submitting a large batch of jobs or using tools for delivering large data to jobs. In this exercise we will attempt to understand the input and output of the bioinformatics application BLAST .","title":"Exercise Goal"},{"location":"materials/data/part1-ex1-data-needs/#setup","text":"For this exercise, we will use the ap40.uw.osg-htc.org access point. Log in: $ ssh @ap40.uw.osg-htc.org Create a directory for this exercise named blast-data and change into it","title":"Setup"},{"location":"materials/data/part1-ex1-data-needs/#copy-the-input-files","text":"To run BLAST, we need the executable, input file, and reference database. For this example, we'll use the \"pdbaa\" database, which contains sequences for the protein structure from the Protein Data Bank. For our input file, we'll use an abbreviated fasta file with mouse genome information. Copy the BLAST executables: user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/ncbi-blast-2.12.0+-x64-linux.tar.gz user@ap40 $ tar -xzvf ncbi-blast-2.12.0+-x64-linux.tar.gz Download these files to your current directory: user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/pdbaa.tar.gz user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/mouse.fa Untar the pdbaa database: user@ap40 $ tar -xzvf pdbaa.tar.gz","title":"Copy the Input Files"},{"location":"materials/data/part1-ex1-data-needs/#understanding-blast","text":"Remember that blastx is executed in a command like the following: user@ap40 $ ./ncbi-blast-2.12.0+/bin/blastx -db -query -out In the above, the is the name of a file containing a number of genetic sequences (e.g. mouse.fa ), and the database that these are compared against is made up of several files that begin with the same , (e.g. pdbaa/pdbaa ). The output from this analysis will be printed to that is also indicated in the command.","title":"Understanding BLAST"},{"location":"materials/data/part1-ex1-data-needs/#calculating-data-needs","text":"Using the files that you prepared in blast-data , we will calculate how much disk space is needed if we were to run a hypothetical BLAST job with a wrapper script, where the job: Transfers all of its input files (including the executable) as tarballs Untars the input files tarballs on the execute host Runs blastx using the untarred input files Here are some commands that will be useful for calculating your job's storage needs: List the size of a specific file: user@ap40 $ ls -lh List the sizes of all files in the current directory: user@ap40 $ ls -lh Sum the size of all files in a specific directory: user@ap40 $ du -sh ","title":"Calculating Data Needs"},{"location":"materials/data/part1-ex1-data-needs/#input-requirements","text":"Total up the amount of data in all of the files necessary to run the blastx wrapper job, including the executable itself. Write down this number. Also take note of how much total data is in the pdbaa directory. 
Compressed Files Remember, blastx reads the un-compressed pdbaa files.","title":"Input requirements"},{"location":"materials/data/part1-ex1-data-needs/#output-requirements","text":"The output that we care about from blastx is saved in the file whose name is indicated after the -out argument to blastx . Also, remember that HTCondor also creates the error, output, and log files, which you'll need to add up, too. Are there any other files? Total all of these together, as well.","title":"Output requirements"},{"location":"materials/data/part1-ex1-data-needs/#up-next","text":"Next you will create a HTCondor submit script to transfer the Blast input files in order to run Blast on a worker nodes. Next Exercise","title":"Up next!"},{"location":"materials/data/part1-ex2-file-transfer/","text":"Data Exercise 1.2: transfer_input_files, transfer_output_files, and remaps \u00b6 Exercise Goal \u00b6 The objective of this exercise is to refresh yourself on HTCondor file transfer, to implement file compression, and to begin examining the memory and disk space used by your jobs in order to plan larger batches. We will also explore ways to deal with output data. Setup \u00b6 The executable we'll use in this exercise and later today is the same blastx executable from previous exercises. Log in to ap40: $ ssh @ap40.uw.osg-htc.org Then change into the blast-data folder that you created in the previous exercise. Review: HTCondor File Transfer \u00b6 Recall that OSG does NOT have a shared filesystem! Instead, HTCondor transfers your executable and input files (specified with the executable and transfer_input_files submit file directives, respectively) to a working directory on the execute node, regardless of how these files were arranged on the submit node. In this exercise we'll use the same blastx example job that we used previously, but modify the submit file and test how much memory and disk space it uses on the execute node. Start with a test submit file \u00b6 We've started a submit file for you, below, which you'll add to in the remaining steps. executable = transfer_input_files = output = test.out error = test.err log = test.log request_memory = request_disk = request_cpus = 1 requirements = (OSGVO_OS_STRING == \"RHEL 9\") queue Implement file compression \u00b6 In our first blast job from the Software exercises ( 1.1 ), the database files in the pdbaa directory were all transferred, as is, but we could instead transfer them as a single, compressed file using tar . For this version of the job, let's compress our blast database files to send them to the submit node as a single tar.gz file (otherwise known as a tarball), by following the below steps: Change into the pdbaa directory and compress the database files into a single file called pdbaa_files.tar.gz using the tar command. Note that this file will be different from the pdbaa.tar.gz file that you used earlier, because it will only contain the pdbaa files, and not the pdbaa directory, itself.) Remember, a typical command for creating a tar file is: user@ap40 $ tar -cvzf Replacing with the name of the tarball that you would like to create and with a space-separated list of files and/or directories that you want inside pdbaa_files.tar.gz. Move the resulting tarball to the blast-data directory. Create a wrapper script that will first decompress the pdbaa_files.tar.gz file, and then run blast. Because this file will now be our executable in the submit file, we'll also end up transferring the blastx executable with transfer_input_files . 
In the blast-data directory, create a new file, called blast_wrapper.sh , with the following contents: #!/bin/bash tar -xzvf pdbaa_files.tar.gz ./blastx -db pdbaa -query mouse.fa -out mouse.fa.result rm pdbaa.* Also remember to make the script executable: chmod +x blast_wrapper.sh Extra Files! The last line removes the resulting database files that came from pdbaa_files.tar.gz , as these files would otherwise be copied back to the submit server as perceived output since they're \"new\" files that HTCondor didn't transfer over as input. List the executable and input files \u00b6 Make sure to update the submit file with the following: Add the new executable (the wrapper script you created above) In transfer_input_files , list the blastx binary, the pdbaa_files.tar.gz file, and the input query file. Commas, commas everywhere! Remember that transfer_input_files accepts a comma separated list of files, and that you need to list the full location of the blastx executable ( blastx ). There will be no arguments, since the arguments to the blastx command are now captured in the wrapper script. Predict memory and disk requests from your data \u00b6 Also, think about how much memory and disk to request for this job. It's good to start with values that are a little higher than you think a test job will need, but think about: How much memory blastx would use if it loaded all of the database files and the query input file into memory. How much disk space will be necessary on the execute server for the executable, all input files, and all output files (hint: the log file only exists on the submit node). Whether you'd like to request some extra memory or disk space, just in case Look at the log file for your blastx job from Software exercise ( 1.1 ), and compare the memory and disk \"Usage\" to what you predicted from the files. Make sure to update the submit file with more accurate memory and disk requests. You may still want to request slightly more than the job actually used. Run the test job \u00b6 Once you have finished editing the submit file, go ahead and submit the job. It should take a few minutes to complete, and then you can check to make sure that no unwanted files (especially the pdbaa database files) were copied back at the end of the job. Run a du -sh on the directory with this job's input. How does it compare to the directory from Software exercise ( 1.1 ), and why? transfer_output_files \u00b6 So far, we have used HTCondor's new file detection to transfer back the newly created files. An alternative is to be explicit, using the transfer_output_files attribute in the submit file. The upside to this approach is that you can pick to only transfer back a subset of the created files. The downside is that you have to know which files are created. The first exercise is to modify the submit file from the previous example, and add a line like (remember, before the queue ): transfer_output_files = mouse.fa.result You may also remove the last line in the blast_wrapper.sh , the rm pdbaa.* as extra files are no longer an issue - those files will be ignored because we used transfer_output_files . Submit the job, and make sure everything works. Did you get any pdbaa.* files back? The next thing we should try is to see what happens if the file we specify does not exist. Modify your submit file, and change the transfer_output_files to: transfer_output_files = elephant.fa.result Submit the job and see how it behaves. Did it finish successfully? 
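Before moving on to remaps, it may help to see the whole submit file in one place. A sketch of what it could look like at this point is below, with transfer_output_files set back to mouse.fa.result ; the resource requests are illustrative assumptions, so substitute the values you measured from your own log file:
executable = blast_wrapper.sh
transfer_input_files = blastx, pdbaa_files.tar.gz, mouse.fa
transfer_output_files = mouse.fa.result
output = test.out
error = test.err
log = test.log
request_memory = 1GB
request_disk = 2GB
request_cpus = 1
requirements = (OSGVO_OS_STRING == \"RHEL 9\")
queue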
transfer_output_remaps \u00b6 Related to transfer_output_files is transfer_output_remaps , which allows us to rename outputs, or map the outputs to a different storage system (will be explored in the next module). The format of the transfer_output_remaps attribute is a list of remaps, each remap taking the form of src=dst . The destination can be a local path, or a URL. For example: transfer_output_remaps = \"myresults.dat = s3://destination-server.com/myresults.dat\" If you have more than one remap, you can separate them with ; By now, your blast-data directory is probably starting to look messy with a mix of submit files, input data, log file and output data all intermingled. One improvement could be to map our outputs to a separate directory. Create a new directory named science-results . Add a transfer_output_remaps line to the submit file. It is common to place this line right after the transfer_output_files line. Change the transfer_output_files back to mouse.fa.result . Example: transfer_output_files = mouse.fa.result transfer_output_remaps = Fill out the remap line, mapping mouse.fa.result to the destination science-results/mouse.fa.result . Remember that the transfer_output_remaps value requires double quotes around it. Submit the job, and wait for it to complete. Are there any errors? Can you find mouse.fa.result? Conclusions \u00b6 In this exercise, you: Used your data requirements knowledge from the previous exercise to write a job. Executed the job on a remote worker node and took note of the data usage. Used transfer_input_files to transfer inputs Used transfer_output_files to transfer outputs Used transfer_output_remaps to map outputs to a different destination When you've completed the above, continue with the next exercise .","title":"1.2 - transfer_input_files, transfer_output_files, and remaps"},{"location":"materials/data/part1-ex2-file-transfer/#data-exercise-12-transfer_input_files-transfer_output_files-and-remaps","text":"","title":"Data Exercise 1.2: transfer_input_files, transfer_output_files, and remaps"},{"location":"materials/data/part1-ex2-file-transfer/#exercise-goal","text":"The objective of this exercise is to refresh yourself on HTCondor file transfer, to implement file compression, and to begin examining the memory and disk space used by your jobs in order to plan larger batches. We will also explore ways to deal with output data.","title":"Exercise Goal"},{"location":"materials/data/part1-ex2-file-transfer/#setup","text":"The executable we'll use in this exercise and later today is the same blastx executable from previous exercises. Log in to ap40: $ ssh @ap40.uw.osg-htc.org Then change into the blast-data folder that you created in the previous exercise.","title":"Setup"},{"location":"materials/data/part1-ex2-file-transfer/#review-htcondor-file-transfer","text":"Recall that OSG does NOT have a shared filesystem! Instead, HTCondor transfers your executable and input files (specified with the executable and transfer_input_files submit file directives, respectively) to a working directory on the execute node, regardless of how these files were arranged on the submit node. 
In this exercise we'll use the same blastx example job that we used previously, but modify the submit file and test how much memory and disk space it uses on the execute node.","title":"Review: HTCondor File Transfer"},{"location":"materials/data/part1-ex2-file-transfer/#start-with-a-test-submit-file","text":"We've started a submit file for you, below, which you'll add to in the remaining steps. executable = transfer_input_files = output = test.out error = test.err log = test.log request_memory = request_disk = request_cpus = 1 requirements = (OSGVO_OS_STRING == \"RHEL 9\") queue","title":"Start with a test submit file"},{"location":"materials/data/part1-ex2-file-transfer/#implement-file-compression","text":"In our first blast job from the Software exercises ( 1.1 ), the database files in the pdbaa directory were all transferred, as is, but we could instead transfer them as a single, compressed file using tar . For this version of the job, let's compress our blast database files to send them to the submit node as a single tar.gz file (otherwise known as a tarball), by following the below steps: Change into the pdbaa directory and compress the database files into a single file called pdbaa_files.tar.gz using the tar command. Note that this file will be different from the pdbaa.tar.gz file that you used earlier, because it will only contain the pdbaa files, and not the pdbaa directory, itself.) Remember, a typical command for creating a tar file is: user@ap40 $ tar -cvzf Replacing with the name of the tarball that you would like to create and with a space-separated list of files and/or directories that you want inside pdbaa_files.tar.gz. Move the resulting tarball to the blast-data directory. Create a wrapper script that will first decompress the pdbaa_files.tar.gz file, and then run blast. Because this file will now be our executable in the submit file, we'll also end up transferring the blastx executable with transfer_input_files . In the blast-data directory, create a new file, called blast_wrapper.sh , with the following contents: #!/bin/bash tar -xzvf pdbaa_files.tar.gz ./blastx -db pdbaa -query mouse.fa -out mouse.fa.result rm pdbaa.* Also remember to make the script executable: chmod +x blast_wrapper.sh Extra Files! The last line removes the resulting database files that came from pdbaa_files.tar.gz , as these files would otherwise be copied back to the submit server as perceived output since they're \"new\" files that HTCondor didn't transfer over as input.","title":"Implement file compression"},{"location":"materials/data/part1-ex2-file-transfer/#list-the-executable-and-input-files","text":"Make sure to update the submit file with the following: Add the new executable (the wrapper script you created above) In transfer_input_files , list the blastx binary, the pdbaa_files.tar.gz file, and the input query file. Commas, commas everywhere! Remember that transfer_input_files accepts a comma separated list of files, and that you need to list the full location of the blastx executable ( blastx ). There will be no arguments, since the arguments to the blastx command are now captured in the wrapper script.","title":"List the executable and input files"},{"location":"materials/data/part1-ex2-file-transfer/#predict-memory-and-disk-requests-from-your-data","text":"Also, think about how much memory and disk to request for this job. 
It's good to start with values that are a little higher than you think a test job will need, but think about: How much memory blastx would use if it loaded all of the database files and the query input file into memory. How much disk space will be necessary on the execute server for the executable, all input files, and all output files (hint: the log file only exists on the submit node). Whether you'd like to request some extra memory or disk space, just in case Look at the log file for your blastx job from Software exercise ( 1.1 ), and compare the memory and disk \"Usage\" to what you predicted from the files. Make sure to update the submit file with more accurate memory and disk requests. You may still want to request slightly more than the job actually used.","title":"Predict memory and disk requests from your data"},{"location":"materials/data/part1-ex2-file-transfer/#run-the-test-job","text":"Once you have finished editing the submit file, go ahead and submit the job. It should take a few minutes to complete, and then you can check to make sure that no unwanted files (especially the pdbaa database files) were copied back at the end of the job. Run a du -sh on the directory with this job's input. How does it compare to the directory from Software exercise ( 1.1 ), and why?","title":"Run the test job"},{"location":"materials/data/part1-ex2-file-transfer/#transfer_output_files","text":"So far, we have used HTCondor's new file detection to transfer back the newly created files. An alternative is to be explicit, using the transfer_output_files attribute in the submit file. The upside to this approach is that you can pick to only transfer back a subset of the created files. The downside is that you have to know which files are created. The first exercise is to modify the submit file from the previous example, and add a line like (remember, before the queue ): transfer_output_files = mouse.fa.result You may also remove the last line in the blast_wrapper.sh , the rm pdbaa.* as extra files are no longer an issue - those files will be ignored because we used transfer_output_files . Submit the job, and make sure everything works. Did you get any pdbaa.* files back? The next thing we should try is to see what happens if the file we specify does not exist. Modify your submit file, and change the transfer_output_files to: transfer_output_files = elephant.fa.result Submit the job and see how it behaves. Did it finish successfully?","title":"transfer_output_files"},{"location":"materials/data/part1-ex2-file-transfer/#transfer_output_remaps","text":"Related to transfer_output_files is transfer_output_remaps , which allows us to rename outputs, or map the outputs to a different storage system (will be explored in the next module). The format of the transfer_output_remaps attribute is a list of remaps, each remap taking the form of src=dst . The destination can be a local path, or a URL. For example: transfer_output_remaps = \"myresults.dat = s3://destination-server.com/myresults.dat\" If you have more than one remap, you can separate them with ; By now, your blast-data directory is probably starting to look messy with a mix of submit files, input data, log file and output data all intermingled. One improvement could be to map our outputs to a separate directory. Create a new directory named science-results . Add a transfer_output_remaps line to the submit file. It is common to place this line right after the transfer_output_files line. Change the transfer_output_files back to mouse.fa.result . 
Example: transfer_output_files = mouse.fa.result transfer_output_remaps = Fill out the remap line, mapping mouse.fa.result to the destination science-results/mouse.fa.result . Remember that the transfer_output_remaps value requires double quotes around it. Submit the job, and wait for it to complete. Are there any errors? Can you find mouse.fa.result?","title":"transfer_output_remaps"},{"location":"materials/data/part1-ex2-file-transfer/#conclusions","text":"In this exercise, you: Used your data requirements knowledge from the previous exercise to write a job. Executed the job on a remote worker node and took note of the data usage. Used transfer_input_files to transfer inputs Used transfer_output_files to transfer outputs Used transfer_output_remaps to map outputs to a different destination When you've completed the above, continue with the next exercise .","title":"Conclusions"},{"location":"materials/data/part1-ex3-blast-split/","text":"Data Exercise 1.3: Splitting Large Input for Better Throughput \u00b6 The objective of this exercise is to prepare for blasting a much larger input query file by splitting the input for greater throughput and lower memory and disk requirements. Splitting the input will also mean that we don't have to rely on additional large-data measures for the input query files. Setup \u00b6 Log in to ap40.uw.osg-htc.org Create a directory for this exercise named blast-split and change into it. Copy over the following files from the previous exercise : Your submit file blastx pdbaa_files.tar.gz blast_wrapper.sh Remember to modify the submit file for the new locations of the above files. Obtain the large input \u00b6 We've previously used blastx to analyze a relatively small input file of test data, mouse.fa , but let's imagine that you now need to blast a much larger dataset for your research. This dataset can be downloaded with the following command: user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/mouse_rna.tar.gz After un-tar'ing ( tar xzf mouse_rna.tar.gz ) the file, you should be able to confirm that its size is roughly 100 MB. Not only is this near the size cutoff for HTCondor file transfer, but it would also take hours to complete a single blastx analysis for it, and the resulting output file would be huge. Split the input file \u00b6 For blast , it's scientifically valid to split up the input query file, analyze the pieces, and then put the results back together at the end! On the other hand, BLAST databases should not be split, because the blast output includes a score value for each sequence that is calculated relative to the entire length of the database. Because genetic sequence data is used heavily across the life sciences, there are also tools for splitting up the data into smaller files. One of these is called genome tools , and you can download a package of precompiled binaries (just like BLAST) using the following command: user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/gt-1.5.10-Linux_x86_64-64bit-complete.tar.gz Un-tar the gt package ( tar -xzvf ... ), then run its sequence file splitter as follows, with a target file size of 1 MB: user@ap40 $ ./gt-1.5.10-Linux_x86_64-64bit-complete/bin/gt splitfasta -targetsize 1 mouse_rna.fa You'll notice that the result is a set of 100 files, all about the size of 1 MB, and numbered 1 through 100. Run Jobs on Split Input \u00b6 Now, you'll submit jobs on the split input files, where each job will use a different piece of the large original input file.
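As a quick sanity check before setting up the jobs, you could count the pieces produced by the splitter (a simple sketch; it should report 100 at this point):
user@ap40 $ ls mouse_rna.fa.* | wc -l
100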
Modify the submit file \u00b6 First, you'll create a new submit file that passes the input filename as an argument and use a list of applicable filenames. Follow the below steps: Copy the submit file from the previous exercise to a new file called blast_split.sub and modify the \"queue\" line of the submit file to the following: queue inputfile matching mouse_rna.fa.* Replace the mouse.fa instances in the submit file with $(inputfile) , and rename the output, log, and error files to use the same inputfile variable: output = $(inputfile).out error = $(inputfile).err log = $(inputfile).log Add an arguments line to the submit file so it will pass the name of the input file to the wrapper script arguments = $(inputfile) Add the $(inputfile) to the end of your list of transfer_input_files : transfer_input_files = ... , $(inputfile) Remove or comment out transfer_output_files and transfer_output_remaps . Update the memory and disk requests, since the new input file is larger and will also produce larger output. It may be best to overestimate to something like 1 GB for each. Modify the wrapper file \u00b6 Replace instances of the input file name in the blast_wrapper.sh script so that it will insert the first argument in place of the input filename, like so: ./blastx -db pdbaa -query $1 -out $1.result Note Bash shell scripts will use the first argument in place of $1 , the second argument as $2 , etc. Submit the jobs \u00b6 This job will take a bit longer than the job in the last exercise, since the input file is larger (by about 3-fold). Again, make sure that only the desired output , error , and result files come back at the end of the job. In our tests, the jobs ran for ~15 minutes. Jobs on jobs! Be careful to not submit the job again. Why? Our queue statement says ... matching mouse_rna.fa.* , and look at the current directory. There are new files named mouse_rna.fa.X.log and other files. Submitting again, the queue statement would see these new files, and try to run blast on them! If you want to remove all of the extra files, you can try: user@ap40 $ rm *.err *.log *.out *.result Update the resource requests \u00b6 After the job finishes successfully, examine the log file for memory and disk usage, and update the requests in the submit file.","title":"1.3- Splitting input"},{"location":"materials/data/part1-ex3-blast-split/#data-exercise-13-splitting-large-input-for-better-throughput","text":"The objective of this exercise is to prepare for blasting a much larger input query file by splitting the input for greater throughput and lower memory and disk requirements. Splitting the input will also mean that we don't have to rely on additional large-data measures for the input query files.","title":"Data Exercise 1.3: Splitting Large Input for Better Throughput"},{"location":"materials/data/part1-ex3-blast-split/#setup","text":"Log in to ap40.uw.osg-htc.org Create a directory for this exercise named blast-split and change into it. Copy over the following files from the previous exercise : Your submit file blastx pdbaa_files.tar.gz blast_wrapper.sh Remember to modify the submit file for the new locations of the above files.","title":"Setup"},{"location":"materials/data/part1-ex3-blast-split/#obtain-the-large-input","text":"We've previously used blastx to analyze a relatively small input file of test data, mouse.fa , but let's imagine that you now need to blast a much larger dataset for your research. 
This dataset can be downloaded with the following command: user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/mouse_rna.tar.gz After un-tar'ing ( tar xzf mouse_rna.tar.gz ) the file, you should be able to confirm that it's size is roughly 100 MB. Not only is this near the size cutoff for HTCondor file transfer, it would take hours to complete a single blastx analysis for it and the resulting output file would be huge.","title":"Obtain the large input"},{"location":"materials/data/part1-ex3-blast-split/#split-the-input-file","text":"For blast , it's scientifically valid to split up the input query file, analyze the pieces, and then put the results back together at the end! On the other hand, BLAST databases should not be split, because the blast output includes a score value for each sequence that is calculated relative to the entire length of the database. Because genetic sequence data is used heavily across the life sciences, there are also tools for splitting up the data into smaller files. One of these is called genome tools , and you can download a package of precompiled binaries (just like BLAST) using the following command: user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/gt-1.5.10-Linux_x86_64-64bit-complete.tar.gz Un-tar the gt package ( tar -xzvf ... ), then run its sequence file splitter as follows, with the target file size of 1MB: user@ap40 $ ./gt-1.5.10-Linux_x86_64-64bit-complete/bin/gt splitfasta -targetsize 1 mouse_rna.fa You'll notice that the result is a set of 100 files, all about the size of 1 MB, and numbered 1 through 100.","title":"Split the input file"},{"location":"materials/data/part1-ex3-blast-split/#run-a-jobs-on-split-input","text":"Now, you'll submit jobs on the split input files, where each job will use a different piece of the large original input file.","title":"Run a Jobs on Split Input"},{"location":"materials/data/part1-ex3-blast-split/#modify-the-submit-file","text":"First, you'll create a new submit file that passes the input filename as an argument and use a list of applicable filenames. Follow the below steps: Copy the submit file from the previous exercise to a new file called blast_split.sub and modify the \"queue\" line of the submit file to the following: queue inputfile matching mouse_rna.fa.* Replace the mouse.fa instances in the submit file with $(inputfile) , and rename the output, log, and error files to use the same inputfile variable: output = $(inputfile).out error = $(inputfile).err log = $(inputfile).log Add an arguments line to the submit file so it will pass the name of the input file to the wrapper script arguments = $(inputfile) Add the $(inputfile) to the end of your list of transfer_input_files : transfer_input_files = ... , $(inputfile) Remove or comment out transfer_output_files and transfer_output_remaps . Update the memory and disk requests, since the new input file is larger and will also produce larger output. 
It may be best to overestimate to something like 1 GB for each.","title":"Modify the submit file"},{"location":"materials/data/part1-ex3-blast-split/#modify-the-wrapper-file","text":"Replace instances of the input file name in the blast_wrapper.sh script so that it will insert the first argument in place of the input filename, like so: ./blastx -db pdbaa -query $1 -out $1.result Note Bash shell scripts will use the first argument in place of $1 , the second argument as $2 , etc.","title":"Modify the wrapper file"},{"location":"materials/data/part1-ex3-blast-split/#submit-the-jobs","text":"This job will take a bit longer than the job in the last exercise, since the input file is larger (by about 3-fold). Again, make sure that only the desired output , error , and result files come back at the end of the job. In our tests, the jobs ran for ~15 minutes. Jobs on jobs! Be careful not to submit the job again. Why? Our queue statement says ... matching mouse_rna.fa.* , and look at the current directory. There are new files named mouse_rna.fa.X.log and other files. Submitting again, the queue statement would see these new files, and try to run blast on them! If you want to remove all of the extra files, you can try: user@ap40 $ rm *.err *.log *.out *.result","title":"Submit the jobs"},{"location":"materials/data/part1-ex3-blast-split/#update-the-resource-requests","text":"After the job finishes successfully, examine the log file for memory and disk usage, and update the requests in the submit file.","title":"Update the resource requests"},{"location":"materials/data/part2-ex1-osdf-inputs/","text":"Data Exercise 2.1: Using OSDF for Large Shared Data \u00b6 This exercise will use a BLAST workflow to demonstrate the functionality of OSDF for transferring input files to jobs on OSG. Because our individual blast jobs from previous exercises would take a bit longer with a larger database (too long for a workable exercise), we'll imagine for this exercise that our pdbaa_files.tar.gz file is too large for transfer_input_files (larger than ~1 GB). For this exercise, we will use the same inputs, but instead of using transfer_input_files for the pdbaa database, we will place it in OSDF and have the jobs download from there. OSDF is connected to a distributed set of caches spread across the U.S. They are connected with high bandwidth connections to each other, and to the data origin servers, where your data is originally placed. Setup \u00b6 Make sure you're logged in to ap40.uw.osg-htc.org Copy the following files from the previous Blast exercises to a new directory in /home/ called osdf-shared : blast_wrapper.sh blastx mouse_rna.fa.1 mouse_rna.fa.2 mouse_rna.fa.3 Your most recent submit file (probably named blast_split.sub ) Place the Database in OSDF \u00b6 Copy your data to the OSDF space \u00b6 OSDF provides a directory for you to store data which can be accessed through the caching servers. First, you need to move your BLAST database ( pdbaa_files.tar.gz ) into this directory.
Note that there is no server name (hence the 3 slashes in :///); the path is instead based on the namespace ( /ospool/ap40 in this case): transfer_input_files = blastx, $(inputfile), osdf:///ospool/ap40/data/[USERNAME]/pdbaa_files.tar.gz Confirm that your queue statement is correct for the current directory. It should be something like: queue inputfile matching mouse_rna.fa.* Also check that mouse_rna.fa.* files exist in the current directory (you should have copied a few of them from the previous exercise directory). Submit the Job \u00b6 Now submit and monitor the job! If your 100 jobs from the previous exercise haven't started running yet, this job will not yet start. However, after it has been running for ~2 minutes, you're safe to continue to the next exercise! Considerations \u00b6 Why did we not place all files in OSDF (for example, blastx and mouse_rna.fa.* )? What do you think will happen if you make changes to pdbaa_files.tar.gz ? Will the caches be updated automatically, or is there a possibility that the old version of pdbaa_files.tar.gz will be served up to jobs? What is the solution to this problem? (Hint: OSDF only considers the filename when caching data) Note: Keeping OSDF 'Clean' \u00b6 Just as for any data directory, it is VERY important to remove old files from OSDF when you no longer need them, especially so that you'll have plenty of space for such files in the future. For example, you would delete ( rm ) files from /ospool/ap40/data/[USERNAME]/ when you don't need them there anymore, but only after all jobs have finished. The next time you use OSDF after the school, remember to first check for old files that you can delete. Next exercise \u00b6 Once completed, move on to the next exercise: Using OSDF for outputs","title":"2.1 - OSDF for inputs"},{"location":"materials/data/part2-ex1-osdf-inputs/#data-exercise-21-using-osdf-for-large-shared-data","text":"This exercise will use a BLAST workflow to demonstrate the functionality of OSDF for transferring input files to jobs on OSG. Because our individual blast jobs from previous exercises would take a bit longer with a larger database (too long for a workable exercise), we'll imagine for this exercise that our pdbaa_files.tar.gz file is too large for transfer_input_files (larger than ~1 GB). For this exercise, we will use the same inputs, but instead of using transfer_input_files for the pdbaa database, we will place it in OSDF and have the jobs download from there. OSDF is connected to a distributed set of caches spread across the U.S. They are connected with high bandwidth connections to each other, and to the data origin servers, where your data is originally placed.","title":"Data Exercise 2.1: Using OSDF for Large Shared Data"},{"location":"materials/data/part2-ex1-osdf-inputs/#setup","text":"Make sure you're logged in to ap40.uw.osg-htc.org Copy the following files from the previous Blast exercises to a new directory in /home/ called osdf-shared : blast_wrapper.sh blastx mouse_rna.fa.1 mouse_rna.fa.2 mouse_rna.fa.3 Your most recent submit file (probably named blast_split.sub )","title":"Setup"},{"location":"materials/data/part2-ex1-osdf-inputs/#place-the-database-in-osdf","text":"","title":"Place the Database in OSDF"},{"location":"materials/data/part2-ex1-osdf-inputs/#copy-to-your-data-to-the-osdf-space","text":"OSDF provides a directory for you to store data which can be accessed through the caching servers. First, you need to move your BLAST database ( pdbaa_files.tar.gz ) into this directory.
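As a minimal sketch of that step (assuming pdbaa_files.tar.gz is still in your blast-split directory from the earlier exercise, and with [USERNAME] replaced by your own username):
user@ap40 $ cp ~/blast-split/pdbaa_files.tar.gz /ospool/ap40/data/[USERNAME]/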
For ap40.uw.osg-htc.org , the directory to use is /ospool/ap40/data/[USERNAME]/ Note that files placed in the /ospool/ap40/data/[USERNAME]/ directory will only be accessible by your own jobs.","title":"Copy your data to the OSDF space"},{"location":"materials/data/part2-ex1-osdf-inputs/#modify-the-submit-file-and-wrapper","text":"You will have to modify the wrapper and submit file to use OSDF: HTCondor knows how to do OSDF transfers, so you just have to provide the correct URL in transfer_input_files . Note that there is no server name (hence the 3 slashes in :///); the path is instead based on the namespace ( /ospool/ap40 in this case): transfer_input_files = blastx, $(inputfile), osdf:///ospool/ap40/data/[USERNAME]/pdbaa_files.tar.gz Confirm that your queue statement is correct for the current directory. It should be something like: queue inputfile matching mouse_rna.fa.* Also check that mouse_rna.fa.* files exist in the current directory (you should have copied a few of them from the previous exercise directory).","title":"Modify the Submit File and Wrapper"},{"location":"materials/data/part2-ex1-osdf-inputs/#submit-the-job","text":"Now submit and monitor the job! If your 100 jobs from the previous exercise haven't started running yet, this job will not yet start. However, after it has been running for ~2 minutes, you're safe to continue to the next exercise!","title":"Submit the Job"},{"location":"materials/data/part2-ex1-osdf-inputs/#considerations","text":"Why did we not place all files in OSDF (for example, blastx and mouse_rna.fa.* )? What do you think will happen if you make changes to pdbaa_files.tar.gz ? Will the caches be updated automatically, or is there a possibility that the old version of pdbaa_files.tar.gz will be served up to jobs? What is the solution to this problem? (Hint: OSDF only considers the filename when caching data)","title":"Considerations"},{"location":"materials/data/part2-ex1-osdf-inputs/#note-keeping-osdf-clean","text":"Just as for any data directory, it is VERY important to remove old files from OSDF when you no longer need them, especially so that you'll have plenty of space for such files in the future. For example, you would delete ( rm ) files from /ospool/ap40/data/[USERNAME]/ when you don't need them there anymore, but only after all jobs have finished. The next time you use OSDF after the school, remember to first check for old files that you can delete.","title":"Note: Keeping OSDF 'Clean'"},{"location":"materials/data/part2-ex1-osdf-inputs/#next-exercise","text":"Once completed, move on to the next exercise: Using OSDF for outputs","title":"Next exercise"},{"location":"materials/data/part2-ex2-osdf-outputs/","text":"Data Exercise 2.2: Using OSDF for outputs \u00b6 In this exercise, we will run a multimedia program that converts and manipulates video files. In particular, we want to convert large .mov files to smaller (10-100s of MB) mp4 files. Just like the Blast database in the previous exercise , these video files are potentially too large to send to jobs using HTCondor's default file transfer for inputs/outputs, so we will use OSDF. Data \u00b6 To get the exercise set up: Log into ap40.uw.osg-htc.org Create a directory for this exercise named osdf-outputs and change into it.
Download the input data and store it under the OSDF directory ( cd to that directory first): user@ap40 $ cd /ospool/ap40/data/ [ USERNAME ] / user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/ducks.mov user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/teaching.mov user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/test_open_terminal.mov We're going to need a list of these files later. Below is the final list of movie files. cd back to your osdf-outputs directory and create a file named movie_list.txt , with the following content: ducks.mov teaching.mov test_open_terminal.mov Software \u00b6 We'll be using a multi-purpose media tool called ffmpeg to convert video formats. The basic command to convert a file looks like this: user@ap40 $ ./ffmpeg -i input.mov output.mp4 In order to resize our files, we're going to manually set the video bitrate and resize the frames, so that the resulting file is smaller. user@ap40 $ ./ffmpeg -i input.mp4 -b:v 400k -s 640x360 output.mp4 To get the ffmpeg binary do the following: We'll be downloading the ffmpeg pre-built static binary originally from this page: http://johnvansickle.com/ffmpeg/ . user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/ffmpeg-release-64bit-static.tar.xz Once the binary is downloaded, un-tar it, and then copy the main ffmpeg program into your current directory: user@ap40 $ tar -xf ffmpeg-release-64bit-static.tar.xz user@ap40 $ cp ffmpeg-4.0.1-64bit-static/ffmpeg ./ Script \u00b6 We want to write a script that runs on the worker node that uses ffmpeg to convert a .mov file to a smaller format. Our script will need to run the proper executable. Create a file called run_ffmpeg.sh , that does the steps described above. Use the name of the smallest .mov file in the ffmpeg command. An example of that script is below: #!/bin/bash ./ffmpeg -i test_open_terminal.mov -b:v 400k -s 640x360 test_open_terminal.mp4 Ultimately we'll want to submit several jobs (one for each .mov file), but to start with, we'll run one job to make sure that everything works. Remember to chmod +x run_ffmpeg.sh to make the script executable. Submit File \u00b6 Create a submit file for this job, based on other submit files from the school. Things to consider: We'll be copying the video file into the job's working directory from OSDF, so make sure to request enough disk space for the input mov file and the output mp4 file. If you're aren't sure how much to request, ask a helper. Add the same requirements as the previous exercise: requirements = (OSGVO_OS_STRING == \"RHEL 9\") We need to transfer the ffmpeg program that we downloaded above, and the movie from OSDF: transfer_input_files = ffmpeg, osdf:///ospool/ap40/data/[USERNAME]/test_open_terminal.mov Transfer outputs via OSDF. This requires a transfer remap: transfer_output_files = test_open_terminal.mp4 transfer_output_remaps = \"test_open_terminal.mp4 = osdf:///ospool/ap40/data/[USERNAME]/test_open_terminal.mp4\" Initial Job \u00b6 With everything in place, submit the job. Once it finishes, we should check to make sure everything ran as expected: Check the OSDF directory. Did the output .mp4 file return? Check file sizes. How big is the returned .mp4 file? How does that compare to the original .mov input? If your job successfully returned the converted .mp4 file and did not transfer the .mov file to the submit server, and the .mp4 file was appropriately scaled down, then we can go ahead and convert all of the files we uploaded to OSDF. 
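For reference, a sketch of what the single-test submit file described above might look like is below; the output/error/log file names and resource requests are illustrative assumptions, and [USERNAME] is your own username:
executable = run_ffmpeg.sh
transfer_input_files = ffmpeg, osdf:///ospool/ap40/data/[USERNAME]/test_open_terminal.mov
transfer_output_files = test_open_terminal.mp4
transfer_output_remaps = \"test_open_terminal.mp4 = osdf:///ospool/ap40/data/[USERNAME]/test_open_terminal.mp4\"
output = ffmpeg.out
error = ffmpeg.err
log = ffmpeg.log
request_memory = 1GB
request_disk = 2GB
request_cpus = 1
requirements = (OSGVO_OS_STRING == \"RHEL 9\")
queue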
Multiple jobs \u00b6 We wrote the name of the .mov file into our run_ffmpeg.sh executable script. To submit a set of jobs for all of our .mov files, what will we need to change in: The script? The submit file? Once you've thought about it, check your reasoning against the instructions below. Add an argument to your script \u00b6 Look at your run_ffmpeg.sh script. What values will change for every job? The input file will change with every job - and don't forget that the output file will too! Let's make them both into arguments. To add arguments to a bash script, we use the notation $1 for the first argument (our input file) and $2 for the second argument (our output file name). The final script should look like this: #!/bin/bash ./ffmpeg -i $1 -b:v 400k -s 640x360 $2 Modify your submit file \u00b6 We now need to tell each job what arguments to use. We will do this by adding an arguments line to our submit file. Because we'll only have the input file name, the \"output\" file name will be the input file name with the mp4 extension. That should look like this: arguments = $(mov) $(mov).mp4 Update the transfer_input_files to have $(mov) : transfer_input_files = ffmpeg, osdf:///ospool/ap40/data/[USERNAME]/$(mov) Similarly, update the output/remap with $(mov).mp4 : transfer_output_files = $(mov).mp4 transfer_output_remaps = \"$(mov).mp4 = osdf:///ospool/ap40/data/[USERNAME]/$(mov).mp4\" To set these arguments, we will use the queue .. from syntax. In our submit file, we can then change our queue statement to: queue mov from movie_list.txt Once you've made these changes, try submitting all the jobs! Bonus \u00b6 If you wanted to set a different output file name, bitrate and/or size for each original movie, how could you modify: movie_list.txt Your submit file run_ffmpeg.sh to do so? Show hint Here's the changes you can make to the various files: movie_list.txt ducks.mov ducks.mp4 500k 1280x720 teaching.mov teaching.mp4 400k 320x180 test_open_terminal.mov terminal.mp4 600k 640x360 Submit file arguments = $(mov) $(mp4) $(bitrate) $(size) queue mov,mp4,bitrate,size from movie_list.txt run_ffmpeg.sh 1 2 #!/bin/bash ./ffmpeg -i $1 -b:v $3 -s $4 $2","title":"2.2 - OSDF for outputs"},{"location":"materials/data/part2-ex2-osdf-outputs/#data-exercise-22-using-osdf-for-outputs","text":"In this exercise, we will run a multimedia program that converts and manipulates video files. In particular, we want to convert large .mov files to smaller (10-100s of MB) mp4 files. Just like the Blast database in the previous exercise , these video files are potentially too large to send to jobs using HTCondor's default file transfer for inputs/outputs, so we will use OSDF.","title":"Data Exercise 2.2: Using OSDF for outputs"},{"location":"materials/data/part2-ex2-osdf-outputs/#data","text":"To get the exercise set up: Log into ap40.uw.osg-htc.org Create a directory for this exercise named osdf-outputs and change into it. Download the input data and store it under the OSDF directory ( cd to that directory first): user@ap40 $ cd /ospool/ap40/data/ [ USERNAME ] / user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/ducks.mov user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/teaching.mov user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/test_open_terminal.mov We're going to need a list of these files later. Below is the final list of movie files. 
cd back to your osdf-outputs directory and create a file named movie_list.txt , with the following content: ducks.mov teaching.mov test_open_terminal.mov","title":"Data"},{"location":"materials/data/part2-ex2-osdf-outputs/#software","text":"We'll be using a multi-purpose media tool called ffmpeg to convert video formats. The basic command to convert a file looks like this: user@ap40 $ ./ffmpeg -i input.mov output.mp4 In order to resize our files, we're going to manually set the video bitrate and resize the frames, so that the resulting file is smaller. user@ap40 $ ./ffmpeg -i input.mp4 -b:v 400k -s 640x360 output.mp4 To get the ffmpeg binary do the following: We'll be downloading the ffmpeg pre-built static binary originally from this page: http://johnvansickle.com/ffmpeg/ . user@ap40 $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/ffmpeg-release-64bit-static.tar.xz Once the binary is downloaded, un-tar it, and then copy the main ffmpeg program into your current directory: user@ap40 $ tar -xf ffmpeg-release-64bit-static.tar.xz user@ap40 $ cp ffmpeg-4.0.1-64bit-static/ffmpeg ./","title":"Software"},{"location":"materials/data/part2-ex2-osdf-outputs/#script","text":"We want to write a script that runs on the worker node that uses ffmpeg to convert a .mov file to a smaller format. Our script will need to run the proper executable. Create a file called run_ffmpeg.sh , that does the steps described above. Use the name of the smallest .mov file in the ffmpeg command. An example of that script is below: #!/bin/bash ./ffmpeg -i test_open_terminal.mov -b:v 400k -s 640x360 test_open_terminal.mp4 Ultimately we'll want to submit several jobs (one for each .mov file), but to start with, we'll run one job to make sure that everything works. Remember to chmod +x run_ffmpeg.sh to make the script executable.","title":"Script"},{"location":"materials/data/part2-ex2-osdf-outputs/#submit-file","text":"Create a submit file for this job, based on other submit files from the school. Things to consider: We'll be copying the video file into the job's working directory from OSDF, so make sure to request enough disk space for the input mov file and the output mp4 file. If you're aren't sure how much to request, ask a helper. Add the same requirements as the previous exercise: requirements = (OSGVO_OS_STRING == \"RHEL 9\") We need to transfer the ffmpeg program that we downloaded above, and the movie from OSDF: transfer_input_files = ffmpeg, osdf:///ospool/ap40/data/[USERNAME]/test_open_terminal.mov Transfer outputs via OSDF. This requires a transfer remap: transfer_output_files = test_open_terminal.mp4 transfer_output_remaps = \"test_open_terminal.mp4 = osdf:///ospool/ap40/data/[USERNAME]/test_open_terminal.mp4\"","title":"Submit File"},{"location":"materials/data/part2-ex2-osdf-outputs/#initial-job","text":"With everything in place, submit the job. Once it finishes, we should check to make sure everything ran as expected: Check the OSDF directory. Did the output .mp4 file return? Check file sizes. How big is the returned .mp4 file? How does that compare to the original .mov input? If your job successfully returned the converted .mp4 file and did not transfer the .mov file to the submit server, and the .mp4 file was appropriately scaled down, then we can go ahead and convert all of the files we uploaded to OSDF.","title":"Initial Job"},{"location":"materials/data/part2-ex2-osdf-outputs/#multiple-jobs","text":"We wrote the name of the .mov file into our run_ffmpeg.sh executable script. 
To submit a set of jobs for all of our .mov files, what will we need to change in: The script? The submit file? Once you've thought about it, check your reasoning against the instructions below.","title":"Multiple jobs"},{"location":"materials/data/part2-ex2-osdf-outputs/#add-an-argument-to-your-script","text":"Look at your run_ffmpeg.sh script. What values will change for every job? The input file will change with every job - and don't forget that the output file will too! Let's make them both into arguments. To add arguments to a bash script, we use the notation $1 for the first argument (our input file) and $2 for the second argument (our output file name). The final script should look like this: #!/bin/bash ./ffmpeg -i $1 -b:v 400k -s 640x360 $2","title":"Add an argument to your script"},{"location":"materials/data/part2-ex2-osdf-outputs/#modify-your-submit-file","text":"We now need to tell each job what arguments to use. We will do this by adding an arguments line to our submit file. Because we'll only have the input file name, the \"output\" file name will be the input file name with the mp4 extension. That should look like this: arguments = $(mov) $(mov).mp4 Update the transfer_input_files to have $(mov) : transfer_input_files = ffmpeg, osdf:///ospool/ap40/data/[USERNAME]/$(mov) Similarly, update the output/remap with $(mov).mp4 : transfer_output_files = $(mov).mp4 transfer_output_remaps = \"$(mov).mp4 = osdf:///ospool/ap40/data/[USERNAME]/$(mov).mp4\" To set these arguments, we will use the queue .. from syntax. In our submit file, we can then change our queue statement to: queue mov from movie_list.txt Once you've made these changes, try submitting all the jobs!","title":"Modify your submit file"},{"location":"materials/data/part2-ex2-osdf-outputs/#bonus","text":"If you wanted to set a different output file name, bitrate and/or size for each original movie, how could you modify: movie_list.txt Your submit file run_ffmpeg.sh to do so? Show hint Here are the changes you can make to the various files: movie_list.txt ducks.mov ducks.mp4 500k 1280x720 teaching.mov teaching.mp4 400k 320x180 test_open_terminal.mov terminal.mp4 600k 640x360 Submit file arguments = $(mov) $(mp4) $(bitrate) $(size) queue mov,mp4,bitrate,size from movie_list.txt run_ffmpeg.sh #!/bin/bash ./ffmpeg -i $1 -b:v $3 -s $4 $2","title":"Bonus"},{"location":"materials/htcondor/part1-ex1-login/","text":"HTC Exercise 1.1: Log In and Look Around \u00b6 Background \u00b6 There are different High Throughput Computing (HTC) systems at universities, government facilities, and other institutions around the world, and they may have different user experiences. For example, some systems have dedicated resources (which means your job will be guaranteed a certain amount of resources/time to complete), while other systems have opportunistic, backfill resources (which means your job can take advantage of some resources, but those resources could be removed at any time). Other systems have a mix of dedicated and opportunistic resources. During the OSG School, you will practice on two different HTC systems: the \" PATh Facility \" and \" OSG's Open Science Pool (OSPool) \". This will help prepare you for working on a variety of different HTC systems. PATh Facility: The PATh Facility provides researchers with dedicated HTC resources and the ability to run larger and longer jobs .
The HTC execute pool is composed of approximately 30,000 cores and 36 A100 GPUs. OSG's Open Science Pool: The OSPool provides researchers with opportunistic resources and the ability to run many smaller and shorter jobs simultaneously . The OSPool is composed of approximately 60,000+ cores and dozens of different GPUs. Exercise Goal \u00b6 The goal of this first exercise is to log in to the PATh Facility access point and look around a little bit, which will take only a few minutes. If you have trouble getting SSH access to the submit server, ask the instructors right away! Gaining access is critical for all remaining exercises. Logging In \u00b6 Today, you will use a High Throughput Computing system known as the \" PATh Facility \". The PATh Facility provides users with dedicated resources and longer runtimes than OSG's Open Science Pool. You will log in to the access point of the PATh Facility, which is called ap1.facility.path-cc.io , using the username you previously created. To log in, use a Secure Shell (SSH) client. From a Mac or Linux computer, start the Terminal app and run the ssh command below, filling in your username before the @ sign: $ ssh @ap1.facility.path-cc.io On Windows, we recommend a free client called PuTTY , but any SSH client should be fine. If you need help finding or using an SSH client, ask the instructors for help right away ! Running Commands \u00b6 In the exercises, we will show commands that you are supposed to type or copy into the command line, like this: username@ap1 $ hostname path-ap2001 Note In the first line of the example above, the username@ap1 $ part is meant to show the Linux command-line prompt. You do not type this part! Further, your actual prompt probably is a bit different, and that is expected. So in the example above, the command that you type at your own prompt is just the eight characters hostname . The second line of the example, without the prompt, shows the output of the command; you do not type this part, either. Here are a few other commands that you can try (the examples below do not show the output from each command): username@ap1 $ whoami username@ap1 $ date username@ap1 $ uname -a A suggestion for the day: try typing into the command line as many of the commands as you can. Copy-and-paste is fine, of course, but you WILL learn more if you take the time to type each command yourself. Organizing Your Workspace \u00b6 You will be doing many different exercises over the next few days, many of them on this access point. Each exercise may use many files, once finished. To avoid confusion, it may be useful to create a separate directory for each exercise. For instance, for the rest of this exercise, you may wish to create and use a directory named intro-1.1-login , or something like that. username@ap1 $ mkdir intro-1.1-login username@ap1 $ cd intro-1.1-login Showing the Version of HTCondor \u00b6 HTCondor is installed on this server. But what version? You can ask HTCondor itself: username@ap1 $ condor_version $ CondorVersion: 23.9.0 2024-06-27 BuildID: 742143 PackageID: 23.9.0-0.742143 GitSHA: 68fde429 RC $ $ CondorPlatform: x86_64_AlmaLinux8 $ As you can see from the output, we are using HTCondor 23.9.0. Reference Materials \u00b6 Here are a few links to reference materials that might be interesting after the school (or perhaps during). HTCondor manuals ; it is probably best to read the manual corresponding to the version of HTCondor that you use.
That link points to the latest version of the manual, but you can switch versions using the toggle in the lower left corner of that page.","title":"1.1 - Log in and look around"},{"location":"materials/htcondor/part1-ex1-login/#htc-exercise-11-log-in-and-look-around","text":"","title":"HTC Exercise 1.1: Log In and Look Around"},{"location":"materials/htcondor/part1-ex1-login/#background","text":"There are different High Throughput Computing (HTC) systems at universities, government facilities, and other institutions around the world, and they may have different user experiences. For example, some systems have dedicated resources (which means your job will be guaranteed a certain amount of resources/time to complete), while other systems have opportunistic, backfill resources (which means your job can take advantage of some resources, but those resources could be removed at any time). Other systems have a mix of dedicated and opportunistic resources. During the OSG School, you will practice on two different HTC systems: the \" PATh Facility \" and \" OSG's Open Science Pool (OSPool) \". This will help prepare you for working on a variety of different HTC systems. PATh Facility: The PATh Facility provides researchers with dedicated HTC resources and the ability to run larger and longer jobs . The HTC execute pool is composed of approximately 30,000 cores and 36 A100 GPUs. OSG's Open Science Pool: The OSPool provides researchers with opportunistic resources and the ability to run many smaller and shorter jobs simultaneously . The OSPool is composed of approximately 60,000+ cores and dozens of different GPUs.","title":"Background"},{"location":"materials/htcondor/part1-ex1-login/#exercise-goal","text":"The goal of this first exercise is to log in to the PATh Facility access point and look around a little bit, which will take only a few minutes. If you have trouble getting SSH access to the submit server, ask the instructors right away! Gaining access is critical for all remaining exercises.","title":"Exercise Goal"},{"location":"materials/htcondor/part1-ex1-login/#logging-in","text":"Today, you will use a High Throughput Computing system known as the \" PATh Facility \". The PATh Facility provides users with dedicated resources and longer runtimes than OSG's Open Science Pool. You will log in to the access point of the PATh Facility, which is called ap1.facility.path-cc.io , using the username you previously created. To log in, use a Secure Shell (SSH) client. From a Mac or Linux computer, start the Terminal app and run the ssh command below, filling in your username before the @ sign: $ ssh @ap1.facility.path-cc.io On Windows, we recommend a free client called PuTTY , but any SSH client should be fine. If you need help finding or using an SSH client, ask the instructors for help right away !","title":"Logging In"},{"location":"materials/htcondor/part1-ex1-login/#running-commands","text":"In the exercises, we will show commands that you are supposed to type or copy into the command line, like this: username@ap1 $ hostname path-ap2001 Note In the first line of the example above, the username@ap1 $ part is meant to show the Linux command-line prompt. You do not type this part! Further, your actual prompt probably is a bit different, and that is expected. So in the example above, the command that you type at your own prompt is just the eight characters hostname . The second line of the example, without the prompt, shows the output of the command; you do not type this part, either.
Here are a few other commands that you can try (the examples below do not show the output from each command): username@ap1 $ whoami username@ap1 $ date username@ap1 $ uname -a A suggestion for the day: try typing into the command line as many of the commands as you can. Copy-and-paste is fine, of course, but you WILL learn more if you take the time to type each command yourself.","title":"Running Commands"},{"location":"materials/htcondor/part1-ex1-login/#organizing-your-workspace","text":"You will be doing many different exercises over the next few days, many of them on this access point. Each exercise may use many files by the time it is finished. To avoid confusion, it may be useful to create a separate directory for each exercise. For instance, for the rest of this exercise, you may wish to create and use a directory named intro-1.1-login , or something like that. username@ap1 $ mkdir intro-1.1-login username@ap1 $ cd intro-1.1-login","title":"Organizing Your Workspace"},{"location":"materials/htcondor/part1-ex1-login/#showing-the-version-of-htcondor","text":"HTCondor is installed on this server. But what version? You can ask HTCondor itself: username@ap1 $ condor_version $ CondorVersion: 23 .9.0 2024 -06-27 BuildID: 742143 PackageID: 23 .9.0-0.742143 GitSHA: 68fde429 RC $ $ CondorPlatform: x86_64_AlmaLinux8 $ As you can see from the output, we are using HTCondor 23.9.0.","title":"Showing the Version of HTCondor"},{"location":"materials/htcondor/part1-ex1-login/#reference-materials","text":"Here are a few links to reference materials that might be interesting after the school (or perhaps during). HTCondor manuals ; it is probably best to read the manual corresponding to the version of HTCondor that you use. That link points to the latest version of the manual, but you can switch versions using the toggle in the lower left corner of that page.","title":"Reference Materials"},{"location":"materials/htcondor/part1-ex2-commands/","text":"pre em { font-style: normal; background-color: yellow; } pre strong { font-style: normal; font-weight: bold; color: \\#008; } HTC Exercise 1.2: Experiment With HTCondor Commands \u00b6 Exercise Goal \u00b6 The goal of this exercise is to learn about two very important HTCondor commands, condor_q and condor_status . They will be useful for monitoring your jobs and available execute point slots (respectively) throughout the week. This exercise should take only a few minutes. Viewing Slots \u00b6 As discussed in the lecture, the condor_status command is used to view the current state of slots in an HTCondor pool. At its most basic, the command is: username@ap1 $ condor_status When running this command, there is typically a lot of output printed to the screen. Looking at your terminal output, there is one line per execute point slot. TIP: You can widen your terminal window, which may help you to see all details of the output better. 
Here is some example output (what you see will be longer): slot1@FIU-PATH-EP.osgvo-docker-pilot-55c74f5b7c-kbs77 LINUX X86_64 Unclaimed Idle 0.000 8053 0+01:14:34 slot1@UNL-PATH-EP.osgvo-docker-pilot-9489b6b4-9rf4n LINUX X86_64 Claimed Busy 0.930 1024 0+02:42:08 slot1@WISC-PATH-EP.osgvo-docker-pilot-7b46dbdbb7-xqkkg LINUX X86_64 Claimed Busy 3.530 1024 0+02:40:24 slot1@SYRA-PATH-EP.osgvo-docker-pilot-gpu-7f6c64d459 LINUX X86_64 Owner Idle 0.300 250 7+03:22:21 This output consists of 8 columns: Col Example Meaning Name slot1@UNL-PATH-EP.osgvo-docker-pilot-9489b6b4-9rf4n Full slot name (including the hostname) OpSys LINUX Operating system Arch X86_64 Slot architecture (e.g., Intel 64 bit) State Claimed State of the slot ( Unclaimed is available, Owner is being used by the machine owner, Claimed is matched to a job) Activity Busy Is there activity on the slot? LoadAv 0.930 Load average, a measure of CPU activity on the slot Mem 1024 Memory available to the slot, in MB ActvtyTime 0+02:42:08 Amount of time spent in current activity (days + hours:minutes:seconds) At the end of the slot listing, there is a summary. Here is an example: Machines Owner Claimed Unclaimed Matched Preempting Drain X86_64/LINUX 10831 0 10194 631 0 0 6 X86_64/WINDOWS 2 2 0 0 0 0 0 Total 10833 2 10194 631 0 0 6 There is one row of summary for each machine (i.e. \"slot\") architecture/operating system combination with columns for the number of slots in each state. The final row gives a summary of slot states for the whole pool. Questions: \u00b6 When you run condor_status , how many 64-bit Linux slots are available? (Hint: Unclaimed = available.) What percent of the total slots are currently claimed by a job? (Note: there is a rapid turnover of slots, which is what allows users with new submission to have jobs start quickly.) How have these numbers changed (if at all) when you run the condor_status command again? Viewing Whole Machines, Only \u00b6 Also try out the -compact for a slightly different view of whole machines (i.e. server hostnames), without the individual slots shown. username@ap1 $ condor_status -compact How has the column information changed? Viewing Jobs \u00b6 The condor_q command lists jobs that are on this access point machine and that are running or waiting to run. The _q part of the name is meant to suggest the word \u201cqueue\u201d, or list of job sets waiting to finish. Viewing Your Own Jobs \u00b6 The default behavior of the command lists only your jobs: username@ap1 $ condor_q The main part of the output (which will be empty, because you haven't submitted jobs yet) shows one set (\"batch\") of submitted jobs per line. If you had a single job in the queue, it would look something like the below: -- Schedd: ap1.facility.path-cc.io : <128.104.100.43:9618?... 
@ 07/12/23 09:59:31 OWNER BATCH_NAME SUBMITTED DONE RUN IDLE TOTAL JOB_IDS alice CMD: run_ffmpeg.sh 7/12 09:58 _ _ 1 1 18801.0 This output consists of 8 (or 9) columns: Col Example Meaning OWNER alice The user ID of the user who submitted the job BATCH_NAME run_ffmpeg.sh The executable or \"jobbatchname\" specified within the submit file(s) SUBMITTED 7/12 09:58 The date and time when the job was submitted DONE _ Number of jobs in this batch that have completed RUN _ Number of jobs in this batch that are currently running IDLE 1 Number of jobs in this batch that are idle, waiting for a match HOLD _ Column will show up if there are jobs on \"hold\" because something about the submission/setup needs to be corrected by the user TOTAL 1 Total number of jobs in this batch JOB_IDS 18801.0 Job ID or range of Job IDs in this batch At the end of the job listing, there is a summary. Here is a sample: 1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended It shows total counts of jobs in the different possible states. Questions: For the sample above, when was the job submitted? For the sample above, was the job running or not yet? How can you tell? Viewing Everyone\u2019s Jobs \u00b6 By default, the condor_q command shows your jobs only. To see everyone\u2019s jobs that are queued on the machine, add the -all option: username@ap1 $ condor_q -all How many jobs are queued in total (i.e., running or waiting to run)? How many jobs from this submit machine are running right now? Viewing Jobs without the Default \"batch\" Mode \u00b6 The condor_q output, by default, groups \"batches\" of jobs together (if they were submitted with the same submit file or \"jobbatchname\"). To see more information for EVERY job on a separate line of output, use the -nobatch option to condor_q : username@ap1 $ condor_q -all -nobatch How has the column information changed? (Below is an example of the top of the output.) -- Schedd: ap1.facility.path-cc.io : <128.104.100.43:9618?... @ 07/12/23 11:58:44 ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD 18203.0 s16_alirezakho 7/11 09:51 0+00:00:00 I 0 0.7 pascal 18204.0 s16_alirezakho 7/11 09:51 0+00:00:00 I 0 0.7 pascal 18801.0 alice 7/12 09:58 0+00:00:00 I 0 0.0 run_ffmpeg.sh 18997.0 s16_martincum 7/12 10:59 0+00:00:32 I 0 733.0 runR.pl 1_0 run_perm.R 1 0 10 19027.5 s16_martincum 7/12 11:06 0+00:09:20 I 0 2198.0 runR.pl 1_5 run_perm.R 1 5 1000 The -nobatch output shows a line for every job and consists of 8 columns: Col Example Meaning ID 18801.0 Job ID, which is the cluster , a dot character ( . ), and the process OWNER alice The user ID of the user who submitted the job SUBMITTED 7/12 09:58 The date and time when the job was submitted RUN_TIME 0+00:00:00 Total time spent running so far (days + hours:minutes:seconds) ST I Status of job: I is Idle (waiting to run), R is Running, H is Held, etc. PRI 0 Job priority (see next lecture) SIZE 0.0 Current run-time memory usage, in MB CMD run_ffmpeg.sh The executable command (with arguments) to be run In future exercises, you'll want to switch between condor_q and condor_q -nobatch to see different types of information about YOUR jobs. Extra Information \u00b6 Both condor_status and condor_q have many command-line options, some of which significantly change their output. You will explore a few of the most useful options in future exercises, but if you want to experiment now, go ahead! 
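For example, here is a small sketch of a few option combinations you might try on the access point (these are standard condor_status and condor_q options, but the exact counts you see will differ from the examples in this exercise): username@ap1 $ condor_status -total username@ap1 $ condor_status -constraint 'State == \"Unclaimed\"' -total username@ap1 $ condor_q -all -totals The -total / -totals forms print only the summary lines, and -constraint limits the listing to slots matching an expression, which can make the questions above easier to answer. 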
There are a few ways to learn more about the commands: Use the (brief) built-in help for the commands, e.g.: condor_q -h Read the installed man(ual) pages for the commands, e.g.: man condor_q Find the command in the online manual ; note: the text online is the same as the man text, only formatted for the web","title":"1.2 - Experiment with HTCondor commands"},{"location":"materials/htcondor/part1-ex2-commands/#htc-exercise-12-experiment-with-htcondor-commands","text":"","title":"HTC Exercise 1.2: Experiment With HTCondor Commands"},{"location":"materials/htcondor/part1-ex2-commands/#exercise-goal","text":"The goal of this exercise is to learn about two very important HTCondor commands, condor_q and condor_status . They will be useful for monitoring your jobs and available execute point slots (respectively) throughout the week. This exercise should take only a few minutes.","title":"Exercise Goal"},{"location":"materials/htcondor/part1-ex2-commands/#viewing-slots","text":"As discussed in the lecture, the condor_status command is used to view the current state of slots in an HTCondor pool. At its most basic, the command is: username@ap1 $ condor_status When running this command, there is typically a lot of output printed to the screen. Looking at your terminal output, there is one line per execute point slot. TIP: You can widen your terminal window, which may help you to see all details of the output better. Here is some example output (what you see will be longer): slot1@FIU-PATH-EP.osgvo-docker-pilot-55c74f5b7c-kbs77 LINUX X86_64 Unclaimed Idle 0.000 8053 0+01:14:34 slot1@UNL-PATH-EP.osgvo-docker-pilot-9489b6b4-9rf4n LINUX X86_64 Claimed Busy 0.930 1024 0+02:42:08 slot1@WISC-PATH-EP.osgvo-docker-pilot-7b46dbdbb7-xqkkg LINUX X86_64 Claimed Busy 3.530 1024 0+02:40:24 slot1@SYRA-PATH-EP.osgvo-docker-pilot-gpu-7f6c64d459 LINUX X86_64 Owner Idle 0.300 250 7+03:22:21 This output consists of 8 columns: Col Example Meaning Name slot1@UNL-PATH-EP.osgvo-docker-pilot-9489b6b4-9rf4n Full slot name (including the hostname) OpSys LINUX Operating system Arch X86_64 Slot architecture (e.g., Intel 64 bit) State Claimed State of the slot ( Unclaimed is available, Owner is being used by the machine owner, Claimed is matched to a job) Activity Busy Is there activity on the slot? LoadAv 0.930 Load average, a measure of CPU activity on the slot Mem 1024 Memory available to the slot, in MB ActvtyTime 0+02:42:08 Amount of time spent in current activity (days + hours:minutes:seconds) At the end of the slot listing, there is a summary. Here is an example: Machines Owner Claimed Unclaimed Matched Preempting Drain X86_64/LINUX 10831 0 10194 631 0 0 6 X86_64/WINDOWS 2 2 0 0 0 0 0 Total 10833 2 10194 631 0 0 6 There is one row of summary for each machine (i.e. \"slot\") architecture/operating system combination with columns for the number of slots in each state. The final row gives a summary of slot states for the whole pool.","title":"Viewing Slots"},{"location":"materials/htcondor/part1-ex2-commands/#questions","text":"When you run condor_status , how many 64-bit Linux slots are available? (Hint: Unclaimed = available.) What percent of the total slots are currently claimed by a job? (Note: there is a rapid turnover of slots, which is what allows users with new submission to have jobs start quickly.) 
How have these numbers changed (if at all) when you run the condor_status command again?","title":"Questions:"},{"location":"materials/htcondor/part1-ex2-commands/#viewing-whole-machines-only","text":"Also try out the -compact for a slightly different view of whole machines (i.e. server hostnames), without the individual slots shown. username@ap1 $ condor_status -compact How has the column information changed?","title":"Viewing Whole Machines, Only"},{"location":"materials/htcondor/part1-ex2-commands/#viewing-jobs","text":"The condor_q command lists jobs that are on this access point machine and that are running or waiting to run. The _q part of the name is meant to suggest the word \u201cqueue\u201d, or list of job sets waiting to finish.","title":"Viewing Jobs"},{"location":"materials/htcondor/part1-ex2-commands/#viewing-your-own-jobs","text":"The default behavior of the command lists only your jobs: username@ap1 $ condor_q The main part of the output (which will be empty, because you haven't submitted jobs yet) shows one set (\"batch\") of submitted jobs per line. If you had a single job in the queue, it would look something like the below: -- Schedd: ap1.facility.path-cc.io : <128.104.100.43:9618?... @ 07/12/23 09:59:31 OWNER BATCH_NAME SUBMITTED DONE RUN IDLE TOTAL JOB_IDS alice CMD: run_ffmpeg.sh 7/12 09:58 _ _ 1 1 18801.0 This output consists of 8 (or 9) columns: Col Example Meaning OWNER alice The user ID of the user who submitted the job BATCH_NAME run_ffmpeg.sh The executable or \"jobbatchname\" specified within the submit file(s) SUBMITTED 7/12 09:58 The date and time when the job was submitted DONE _ Number of jobs in this batch that have completed RUN _ Number of jobs in this batch that are currently running IDLE 1 Number of jobs in this batch that are idle, waiting for a match HOLD _ Column will show up if there are jobs on \"hold\" because something about the submission/setup needs to be corrected by the user TOTAL 1 Total number of jobs in this batch JOB_IDS 18801.0 Job ID or range of Job IDs in this batch At the end of the job listing, there is a summary. Here is a sample: 1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended It shows total counts of jobs in the different possible states. Questions: For the sample above, when was the job submitted? For the sample above, was the job running or not yet? How can you tell?","title":"Viewing Your Own Jobs"},{"location":"materials/htcondor/part1-ex2-commands/#viewing-everyones-jobs","text":"By default, the condor_q command shows your jobs only. To see everyone\u2019s jobs that are queued on the machine, add the -all option: username@ap1 $ condor_q -all How many jobs are queued in total (i.e., running or waiting to run)? How many jobs from this submit machine are running right now?","title":"Viewing Everyone\u2019s Jobs"},{"location":"materials/htcondor/part1-ex2-commands/#viewing-jobs-without-the-default-batch-mode","text":"The condor_q output, by default, groups \"batches\" of jobs together (if they were submitted with the same submit file or \"jobbatchname\"). To see more information for EVERY job on a separate line of output, use the -nobatch option to condor_q : username@ap1 $ condor_q -all -nobatch How has the column information changed? (Below is an example of the top of the output.) -- Schedd: ap1.facility.path-cc.io : <128.104.100.43:9618?... 
@ 07/12/23 11:58:44 ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD 18203.0 s16_alirezakho 7/11 09:51 0+00:00:00 I 0 0.7 pascal 18204.0 s16_alirezakho 7/11 09:51 0+00:00:00 I 0 0.7 pascal 18801.0 alice 7/12 09:58 0+00:00:00 I 0 0.0 run_ffmpeg.sh 18997.0 s16_martincum 7/12 10:59 0+00:00:32 I 0 733.0 runR.pl 1_0 run_perm.R 1 0 10 19027.5 s16_martincum 7/12 11:06 0+00:09:20 I 0 2198.0 runR.pl 1_5 run_perm.R 1 5 1000 The -nobatch output shows a line for every job and consists of 8 columns: Col Example Meaning ID 18801.0 Job ID, which is the cluster , a dot character ( . ), and the process OWNER alice The user ID of the user who submitted the job SUBMITTED 7/12 09:58 The date and time when the job was submitted RUN_TIME 0+00:00:00 Total time spent running so far (days + hours:minutes:seconds) ST I Status of job: I is Idle (waiting to run), R is Running, H is Held, etc. PRI 0 Job priority (see next lecture) SIZE 0.0 Current run-time memory usage, in MB CMD run_ffmpeg.sh The executable command (with arguments) to be run In future exercises, you'll want to switch between condor_q and condor_q -nobatch to see different types of information about YOUR jobs.","title":"Viewing Jobs without the Default \"batch\" Mode"},{"location":"materials/htcondor/part1-ex2-commands/#extra-information","text":"Both condor_status and condor_q have many command-line options, some of which significantly change their output. You will explore a few of the most useful options in future exercises, but if you want to experiment now, go ahead! There are a few ways to learn more about the commands: Use the (brief) built-in help for the commands, e.g.: condor_q -h Read the installed man(ual) pages for the commands, e.g.: man condor_q Find the command in the online manual ; note: the text online is the same as the man text, only formatted for the web","title":"Extra Information"},{"location":"materials/htcondor/part1-ex3-jobs/","text":"pre em { font-style: normal; background-color: yellow; } pre strong { font-style: normal; font-weight: bold; color: \\#008; } HTC Exercise 1.3: Run Jobs! \u00b6 Exercise Goal \u00b6 The goal of this exercise is to submit jobs to HTCondor and have them run on the PATh Facility. This is a huge step in learning to use an HTC system! This exercise will take longer than the first two, short ones. If you are having any problems getting the jobs to run, please ask the instructors! It is very important that you know how to run jobs. Running Your First Job \u00b6 Nearly all of the time, when you want to run an HTCondor job, you first write an HTCondor submit file for it. In this section, you will run the same hostname command as in Exercise 1.1, but where this command will run within a job on one of the 'execute' servers on the PATh Facility's HTCondor pool. First, create an example submit file called hostname.sub using your favorite text editor (e.g., nano , vim ) and then transfer the following information to that file: executable = /bin/hostname output = hostname.out error = hostname.err log = hostname.log request_cpus = 1 request_memory = 1GB request_disk = 1GB queue Save your submit file using the name hostname.sub . Note You can name the HTCondor submit file using any filename. It's a good practice to always include the .sub extension, but it is not required. This is because the submit file is a simple text file that we are using to pass information to HTCondor. 
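Before going further, you can double-check what you just saved by printing the file back to the screen (this assumes you used the suggested hostname.sub filename): username@ap1 $ cat hostname.sub 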
The lines of the submit file have the following meanings: Submit Command Explanation executable The name of the program to run (relative to the directory from which you submit). output The filename where HTCondor will write the standard output from your job. error The filename where HTCondor will write the standard error from your job. This particular job is not likely to have any, but it is best to include this line for every job. log The filename where HTCondor will write information about your job run. While not required, it is a really good idea to have a log file for every job. request_* Tells HTCondor how many cpus and how much memory and disk we want, which is not much, because the 'hostname' executable is very small. queue Tells HTCondor to run your job with the settings above. Note that we are not using the arguments or transfer_input_files lines that were mentioned during lecture because the hostname program is all that needs to be transferred from the access point server, and we want to run it without any additional options. Double-check your submit file, so that it matches the text above. Then, tell HTCondor to run your job: username@ap1 $ condor_submit hostname.sub Submitting job(s). 1 job(s) submitted to cluster NNNN. The actual cluster number will be shown instead of NNNN . If, instead of the text above, there are error messages, read them carefully and then try to correct your submit file or ask for help. Notice that condor_submit returns back to the shell prompt right away. It does not wait for your job to run. Instead, as soon as it has finished submitting your job into the queue, the submit command finishes. View your job in the queue \u00b6 Now, use condor_q and condor_q -nobatch to watch for your job in the queue! You may not even catch the job in the R running state, because the hostname command runs very quickly. When the job itself is finished, it will 'leave' the queue and no longer be listed in the condor_q output. After the job finishes, check for the hostname output in hostname.out , which is where job information printed to the terminal screen will be printed for the job. username@ap1 $ cat hostname.out e171.chtc.wisc.edu The hostname.err file should be empty, unless there were issues running the hostname executable after it was transferred to the slot. The hostname.log is more complex and will be the focus of a later exercise. Running a Job With Arguments \u00b6 Very often, when you run a command on the command line, it includes arguments (i.e. options) after the program name, as in the below examples: username@ap1 $ sleep 60 In an HTCondor submit file, the program (or 'executable') name goes in the executable statement and all remaining arguments go into an arguments statement. For example, if the full command is: username@ap1 $ sleep 60 Then in the submit file, we would put the location of the \"sleep\" program (you can find it with which sleep ) as the job executable , and 60 as the job arguments : executable = /bin/sleep arguments = 60 Let\u2019s try a job submission with arguments. We will use the sleep command shown above, which does nothing (i.e., puts the job to sleep) for the specified number of seconds, then exits normally. It is convenient for simulating a job that takes a while to run. Create a new submit file and save the following text in it. 
executable = /bin/sleep arguments = 60 output = sleep.out error = sleep.err log = sleep.log request_cpus = 1 request_memory = 1GB request_disk = 1GB queue You can save the file using any name, but as a reminder, we recommend it uses the .sub file extension. Except for changing a few filenames, this submit file is nearly identical to the last one, except for the addition of the arguments line. Submit this new job to HTCondor. Again, watch for it to run using condor_q and condor_q -nobatch ; check once every 15 seconds or so. Once the job starts running, it will take about 1 minute to run (reminder: the sleep command is telling the job to do nothing for 60 seconds), so you should be able to see it running for a bit. When the job finishes, it will disappear from the queue, but there will be no output in the output or error files, because sleep does not produce any output. Running a Script Job From the Submit Directory \u00b6 So far, we have been running programs (executables) that come with the standard Linux system. More frequently, you will want to run a program that exists within your directory or perhaps a shell script of commands that you'd like to run within a job. In this example, you will write a shell script and a submit file that runs the shell script within a job: Put the following contents into a file named test-script.sh : #!/bin/sh # START echo 'Date: ' ` date ` echo 'Host: ' ` hostname ` echo 'System: ' ` uname -spo ` echo \"Program: $0 \" echo \"Args: $* \" echo 'ls: ' ` ls ` # END Add executable permissions to the file (so that it can be run as a program): username@ap1 $ chmod +x test-script.sh Test your script from the command line: username@ap1 $ ./test-script.sh hello 42 Date: Mon Jul 1 14:03:56 CDT 2024 Host: path-ap2001 System: Linux x86_64 GNU/Linux Program: ./test-script.sh Args: hello 42 ls: hostname.err hostname.log hostname.out hostname.sub sleep.log sleep.sub test-script.sh This step is really important! If you cannot run your executable from the command-line, HTCondor probably cannot run it on another machine, either. Further, debugging problems like this one is surprisingly difficult. So, if possible, test your executable and arguments as a command at the command-line first. Write the submit file (this should be getting easier by now): executable = test-script.sh arguments = foo bar baz output = script.out error = script.err log = script.log request_cpus = 1 request_memory = 1GB request_disk = 1GB queue In this example, the executable that was named in the submit file did not start with a / , so the location of the file is relative to the submit directory itself. In other words, in this format the executable must be in the same directory as the submit file. Note Blank lines between commands and spaces around the = do not matter to HTCondor. For example, this submit file is equivalent to the one above: executable = test-script.sh arguments = foo bar baz output = script.out error = script.err log = script.log request_cpus=1 request_memory=1GB request_disk=1GB queue Use whitespace to make things clear to you , the user. Submit the job, wait for it to finish, and check the standard output file (and standard error file, which should be empty). What do you notice about the lines returned for \"Program\" and \"ls\"? Remember that only files pertaining to this job will be in the job working directory on the execute point server. 
You're also seeing the effects of HTCondor's need to standardize some filenames when running your job, though they are named as you expect in the submission directory (per the submit file contents). Extra Challenge \u00b6 Note There are Extra Challenges throughout the school curriculum. You may be better off coming back to these after you've completed all other exercises for your current working session. Below is a Python script that does something similar to the shell script above. Run this Python script using HTCondor. #!/usr/bin/env python3 \"\"\"Extra Challenge for OSG School Written by Tim Cartwright Submitted to CHTC by #YOUR_NAME# \"\"\" import getpass import os import platform import socket import sys import time arguments = None if len ( sys . argv ) > 1 : arguments = '\"' + ' ' . join ( sys . argv [ 1 :]) + '\"' print ( __doc__ , file = sys . stderr ) print ( 'Time :' , time . strftime ( '%Y-%m- %d ( %a ) %H:%M:%S %Z' )) print ( 'Host :' , getpass . getuser (), '@' , socket . gethostname ()) uname = platform . uname () print ( \"System :\" , uname [ 0 ], uname [ 2 ], uname [ 4 ]) print ( \"Version :\" , platform . python_version ()) print ( \"Program :\" , sys . executable ) print ( 'Script :' , os . path . abspath ( __file__ )) print ( 'Args :' , arguments )","title":"1.3 - Run jobs!"},{"location":"materials/htcondor/part1-ex3-jobs/#htc-exercise-13-run-jobs","text":"","title":"HTC Exercise 1.3: Run Jobs!"},{"location":"materials/htcondor/part1-ex3-jobs/#exercise-goal","text":"The goal of this exercise is to submit jobs to HTCondor and have them run on the PATh Facility. This is a huge step in learning to use an HTC system! This exercise will take longer than the first two, short ones. If you are having any problems getting the jobs to run, please ask the instructors! It is very important that you know how to run jobs.","title":"Exercise Goal"},{"location":"materials/htcondor/part1-ex3-jobs/#running-your-first-job","text":"Nearly all of the time, when you want to run an HTCondor job, you first write an HTCondor submit file for it. In this section, you will run the same hostname command as in Exercise 1.1, but where this command will run within a job on one of the 'execute' servers on the PATh Facility's HTCondor pool. First, create an example submit file called hostname.sub using your favorite text editor (e.g., nano , vim ) and then transfer the following information to that file: executable = /bin/hostname output = hostname.out error = hostname.err log = hostname.log request_cpus = 1 request_memory = 1GB request_disk = 1GB queue Save your submit file using the name hostname.sub . Note You can name the HTCondor submit file using any filename. It's a good practice to always include the .sub extension, but it is not required. This is because the submit file is a simple text file that we are using to pass information to HTCondor. The lines of the submit file have the following meanings: Submit Command Explanation executable The name of the program to run (relative to the directory from which you submit). output The filename where HTCondor will write the standard output from your job. error The filename where HTCondor will write the standard error from your job. This particular job is not likely to have any, but it is best to include this line for every job. log The filename where HTCondor will write information about your job run. While not required, it is a really good idea to have a log file for every job. 
request_* Tells HTCondor how many cpus and how much memory and disk we want, which is not much, because the 'hostname' executable is very small. queue Tells HTCondor to run your job with the settings above. Note that we are not using the arguments or transfer_input_files lines that were mentioned during lecture because the hostname program is all that needs to be transferred from the access point server, and we want to run it without any additional options. Double-check your submit file, so that it matches the text above. Then, tell HTCondor to run your job: username@ap1 $ condor_submit hostname.sub Submitting job(s). 1 job(s) submitted to cluster NNNN. The actual cluster number will be shown instead of NNNN . If, instead of the text above, there are error messages, read them carefully and then try to correct your submit file or ask for help. Notice that condor_submit returns back to the shell prompt right away. It does not wait for your job to run. Instead, as soon as it has finished submitting your job into the queue, the submit command finishes.","title":"Running Your First Job"},{"location":"materials/htcondor/part1-ex3-jobs/#view-your-job-in-the-queue","text":"Now, use condor_q and condor_q -nobatch to watch for your job in the queue! You may not even catch the job in the R running state, because the hostname command runs very quickly. When the job itself is finished, it will 'leave' the queue and no longer be listed in the condor_q output. After the job finishes, check for the hostname output in hostname.out , which is where job information printed to the terminal screen will be printed for the job. username@ap1 $ cat hostname.out e171.chtc.wisc.edu The hostname.err file should be empty, unless there were issues running the hostname executable after it was transferred to the slot. The hostname.log is more complex and will be the focus of a later exercise.","title":"View your job in the queue"},{"location":"materials/htcondor/part1-ex3-jobs/#running-a-job-with-arguments","text":"Very often, when you run a command on the command line, it includes arguments (i.e. options) after the program name, as in the below examples: username@ap1 $ sleep 60 In an HTCondor submit file, the program (or 'executable') name goes in the executable statement and all remaining arguments go into an arguments statement. For example, if the full command is: username@ap1 $ sleep 60 Then in the submit file, we would put the location of the \"sleep\" program (you can find it with which sleep ) as the job executable , and 60 as the job arguments : executable = /bin/sleep arguments = 60 Let\u2019s try a job submission with arguments. We will use the sleep command shown above, which does nothing (i.e., puts the job to sleep) for the specified number of seconds, then exits normally. It is convenient for simulating a job that takes a while to run. Create a new submit file and save the following text in it. executable = /bin/sleep arguments = 60 output = sleep.out error = sleep.err log = sleep.log request_cpus = 1 request_memory = 1GB request_disk = 1GB queue You can save the file using any name, but as a reminder, we recommend it uses the .sub file extension. Except for changing a few filenames, this submit file is nearly identical to the last one, except for the addition of the arguments line. Submit this new job to HTCondor. Again, watch for it to run using condor_q and condor_q -nobatch ; check once every 15 seconds or so. 
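If you would rather not re-run condor_q by hand every 15 seconds, one option is the standard Linux watch utility, which repeats a command on a timer; this is a sketch that assumes watch is installed on the access point, and you press Control-C to stop it: username@ap1 $ watch -n 15 condor_q 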
Once the job starts running, it will take about 1 minute to run (reminder: the sleep command is telling the job to do nothing for 60 seconds), so you should be able to see it running for a bit. When the job finishes, it will disappear from the queue, but there will be no output in the output or error files, because sleep does not produce any output.","title":"Running a Job With Arguments"},{"location":"materials/htcondor/part1-ex3-jobs/#running-a-script-job-from-the-submit-directory","text":"So far, we have been running programs (executables) that come with the standard Linux system. More frequently, you will want to run a program that exists within your directory or perhaps a shell script of commands that you'd like to run within a job. In this example, you will write a shell script and a submit file that runs the shell script within a job: Put the following contents into a file named test-script.sh : #!/bin/sh # START echo 'Date: ' ` date ` echo 'Host: ' ` hostname ` echo 'System: ' ` uname -spo ` echo \"Program: $0 \" echo \"Args: $* \" echo 'ls: ' ` ls ` # END Add executable permissions to the file (so that it can be run as a program): username@ap1 $ chmod +x test-script.sh Test your script from the command line: username@ap1 $ ./test-script.sh hello 42 Date: Mon Jul 1 14:03:56 CDT 2024 Host: path-ap2001 System: Linux x86_64 GNU/Linux Program: ./test-script.sh Args: hello 42 ls: hostname.err hostname.log hostname.out hostname.sub sleep.log sleep.sub test-script.sh This step is really important! If you cannot run your executable from the command-line, HTCondor probably cannot run it on another machine, either. Further, debugging problems like this one is surprisingly difficult. So, if possible, test your executable and arguments as a command at the command-line first. Write the submit file (this should be getting easier by now): executable = test-script.sh arguments = foo bar baz output = script.out error = script.err log = script.log request_cpus = 1 request_memory = 1GB request_disk = 1GB queue In this example, the executable that was named in the submit file did not start with a / , so the location of the file is relative to the submit directory itself. In other words, in this format the executable must be in the same directory as the submit file. Note Blank lines between commands and spaces around the = do not matter to HTCondor. For example, this submit file is equivalent to the one above: executable = test-script.sh arguments = foo bar baz output = script.out error = script.err log = script.log request_cpus=1 request_memory=1GB request_disk=1GB queue Use whitespace to make things clear to you , the user. Submit the job, wait for it to finish, and check the standard output file (and standard error file, which should be empty). What do you notice about the lines returned for \"Program\" and \"ls\"? Remember that only files pertaining to this job will be in the job working directory on the execute point server. You're also seeing the effects of HTCondor's need to standardize some filenames when running your job, though they are named as you expect in the submission directory (per the submit file contents).","title":"Running a Script Job From the Submit Directory"},{"location":"materials/htcondor/part1-ex3-jobs/#extra-challenge","text":"Note There are Extra Challenges throughout the school curriculum. You may be better off coming back to these after you've completed all other exercises for your current working session. 
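If you get stuck on the challenge below, a minimal submit-file sketch looks much like the shell-script example above; the filename challenge.py is only an assumed name for wherever you save the script, the script must first be made executable with chmod +x, and this assumes python3 is available on the execute point: executable = challenge.py arguments = hello 42 output = challenge.out error = challenge.err log = challenge.log request_cpus = 1 request_memory = 1GB request_disk = 1GB queue 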
Below is a Python script that does something similar to the shell script above. Run this Python script using HTCondor. #!/usr/bin/env python3 \"\"\"Extra Challenge for OSG School Written by Tim Cartwright Submitted to CHTC by #YOUR_NAME# \"\"\" import getpass import os import platform import socket import sys import time arguments = None if len ( sys . argv ) > 1 : arguments = '\"' + ' ' . join ( sys . argv [ 1 :]) + '\"' print ( __doc__ , file = sys . stderr ) print ( 'Time :' , time . strftime ( '%Y-%m- %d ( %a ) %H:%M:%S %Z' )) print ( 'Host :' , getpass . getuser (), '@' , socket . gethostname ()) uname = platform . uname () print ( \"System :\" , uname [ 0 ], uname [ 2 ], uname [ 4 ]) print ( \"Version :\" , platform . python_version ()) print ( \"Program :\" , sys . executable ) print ( 'Script :' , os . path . abspath ( __file__ )) print ( 'Args :' , arguments )","title":"Extra Challenge"},{"location":"materials/htcondor/part1-ex4-logs/","text":"pre em { font-style: normal; background-color: yellow; } pre strong { font-style: normal; font-weight: bold; color: \\#008; } HTC Exercise 1.4: Read and Interpret Log Files \u00b6 Exercise Goal \u00b6 The goal of this exercise is to learn how to understand the contents of a job's log file, which is essentially a \"history\" of the steps HTCondor took to run your job. If you suspect something has gone wrong with your job, the log is a great place to start looking for indications of whether things might have gone wrong (in addition to the .err file). This exercise is short, but you'll want to at least read over it before moving on. Reading a Log File \u00b6 For this exercise, we can examine a log file for any previous job that you have run. The example output below is based on the sleep 60 job. A job log file is updated throughout the life of a job, usually at key events. Each event starts with a heading that indicates what happened and when. Here are all of the event headings from the sleep job log (detailed output in between headings has been omitted here): 000 (5739.000.000) 2024-07-10 10:44:20 Job submitted from host: <128.104.100.43:9618?addrs=...> 040 (5739.000.000) 2024-07-10 10:45:10 Started transferring input files 040 (5739.000.000) 2024-07-10 10:45:10 Finished transferring input files 001 (5739.000.000) 2024-07-10 10:45:11 Job executing on host: <128.104.55.42:9618?addrs=...> 006 (5739.000.000) 2024-07-10 10:45:20 Image size of job updated: 72 040 (5739.000.000) 2024-07-10 10:45:20 Started transferring output files 040 (5739.000.000) 2024-07-10 10:45:20 Finished transferring output files 006 (5739.000.000) 2024-07-10 10:46:11 Image size of job updated: 4072 005 (5739.000.000) 2024-07-10 10:46:11 Job terminated. There is a lot of extra information in those lines, but you can see: The job ID: cluster 5739, process 0 (written 000 ) The date and local time of each event A brief description of the event: submission, execution, some information updates, and termination Some events provide no information in addition to the heading. For example: 000 (5739.000.000) 2024-07-10 10:44:20 Job submitted from host: <128.104.100.43:9618?addrs=...> ... Note Each event ends with a line that contains only 3 dots: ... However, some lines have additional information to help you quickly understand where and how your job is running. 
For example: 001 (5739.000.000) 2024-07-10 10:45:11 Job executing on host: <128.104.55.42:9618?addrs=...> SlotName: slot1@WISC-PATH-IDPL-EP.osgvo-docker-pilot-idpl-7c6575d494-2sj5w CondorScratchDir = \"/pilot/osgvo-pilot-2q71K9/execute/dir_9316\" Cpus = 1 Disk = 174321444 GLIDEIN_ResourceName = \"WISC-PATH-IDPL-EP\" GPUs = 0 Memory = 8192 ... The SlotName is the name of the execution point slot your job was assigned to by HTCondor, and the name of the execution point resource is provided in GLIDEIN_ResourceName The CondorScratchDir is the name of the scratch directory that was created by HTCondor for your job to run inside The Cpus , GPUs , Disk , and Memory values show the amount of each resource allocated to the slot where your job is running Another example is the periodic update: 006 (5739.000.000) 2024-07-10 10:45:20 Image size of job updated: 72 1 - MemoryUsage of job (MB) 72 - ResidentSetSize of job (KB) ... These updates record the amount of memory that the job is using on the execute machine. This can be helpful information, so that in future runs of the job, you can tell HTCondor how much memory you will need. The job termination event includes a lot of very useful information: 005 (5739.000.000) 2024-07-10 10:46:11 Job terminated. (1) Normal termination (return value 0) Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage 0 - Run Bytes Sent By Job 27848 - Run Bytes Received By Job 0 - Total Bytes Sent By Job 27848 - Total Bytes Received By Job Partitionable Resources : Usage Request Allocated Cpus : 1 1 Disk (KB) : 40 30 4203309 Memory (MB) : 1 1 1 Job terminated of its own accord at 2024-07-10 10:46:11 with exit-code 0. ... Probably the most interesting information is: The return value or exit code ( 0 here means the executable completed and didn't indicate any internal errors; non-zero usually means failure) The total number of bytes transferred each way, which could be useful if your network is slow The Partitionable Resources table, especially disk and memory usage, which will inform larger submissions. There are many other kinds of events, but the ones above will occur in almost every job log. Understanding When Job Log Events Are Written \u00b6 When are events written to the job log file? Let\u2019s find out. Read through the entire procedure below before starting, because some parts of the process are time sensitive. Change the sleep job submit file, so that the job sleeps for 2 minutes (= 120 seconds) Submit the updated sleep job As soon as the condor_submit command finishes, hit the return key a few times, to create some blank lines Right away, run a command to show the log file and keep showing updates as they occur: username@ap1 $ tail -f sleep.log Watch the output carefully. When do events appear in the log file? After the termination event appears, press Control-C to end the tail command and return to the shell prompt. Understanding How HTCondor Writes Files \u00b6 When HTCondor writes the output, error, and log files, does it erase the previous contents of the file or does it add new lines onto the end? Let\u2019s find out! For this exercise, we can use the hostname job from earlier. Edit the hostname submit file so that it uses new and unique filenames for output, error, and log files. Alternatively, delete any existing output, error, and log files from previous runs of the hostname job. 
Submit the job three separate times in a row (there are better ways to do this, which we will cover in the next lecture) Wait for all three jobs to finish Examine the output file: How many hostnames are there? Did HTCondor erase the previous contents for each job, or add new lines? Examine the log file\u2026 carefully: What happened there? Pay close attention to the times and job IDs of the events. For further clarification about how HTCondor handles these files, reach out to your mentor or one of the other school staff.","title":"1.4 - Read and interpret log files"},{"location":"materials/htcondor/part1-ex4-logs/#htc-exercise-14-read-and-interpret-log-files","text":"","title":"HTC Exercise 1.4: Read and Interpret Log Files"},{"location":"materials/htcondor/part1-ex4-logs/#exercise-goal","text":"The goal of this exercise is to learn how to understand the contents of a job's log file, which is essentially a \"history\" of the steps HTCondor took to run your job. If you suspect something has gone wrong with your job, the log is a great place to start looking for indications of whether things might have gone wrong (in addition to the .err file). This exercise is short, but you'll want to at least read over it before moving on.","title":"Exercise Goal"},{"location":"materials/htcondor/part1-ex4-logs/#reading-a-log-file","text":"For this exercise, we can examine a log file for any previous job that you have run. The example output below is based on the sleep 60 job. A job log file is updated throughout the life of a job, usually at key events. Each event starts with a heading that indicates what happened and when. Here are all of the event headings from the sleep job log (detailed output in between headings has been omitted here): 000 (5739.000.000) 2024-07-10 10:44:20 Job submitted from host: <128.104.100.43:9618?addrs=...> 040 (5739.000.000) 2024-07-10 10:45:10 Started transferring input files 040 (5739.000.000) 2024-07-10 10:45:10 Finished transferring input files 001 (5739.000.000) 2024-07-10 10:45:11 Job executing on host: <128.104.55.42:9618?addrs=...> 006 (5739.000.000) 2024-07-10 10:45:20 Image size of job updated: 72 040 (5739.000.000) 2024-07-10 10:45:20 Started transferring output files 040 (5739.000.000) 2024-07-10 10:45:20 Finished transferring output files 006 (5739.000.000) 2024-07-10 10:46:11 Image size of job updated: 4072 005 (5739.000.000) 2024-07-10 10:46:11 Job terminated. There is a lot of extra information in those lines, but you can see: The job ID: cluster 5739, process 0 (written 000 ) The date and local time of each event A brief description of the event: submission, execution, some information updates, and termination Some events provide no information in addition to the heading. For example: 000 (5739.000.000) 2024-07-10 10:44:20 Job submitted from host: <128.104.100.43:9618?addrs=...> ... Note Each event ends with a line that contains only 3 dots: ... However, some lines have additional information to help you quickly understand where and how your job is running. For example: 001 (5739.000.000) 2024-07-10 10:45:11 Job executing on host: <128.104.55.42:9618?addrs=...> SlotName: slot1@WISC-PATH-IDPL-EP.osgvo-docker-pilot-idpl-7c6575d494-2sj5w CondorScratchDir = \"/pilot/osgvo-pilot-2q71K9/execute/dir_9316\" Cpus = 1 Disk = 174321444 GLIDEIN_ResourceName = \"WISC-PATH-IDPL-EP\" GPUs = 0 Memory = 8192 ... 
The SlotName is the name of the execution point slot your job was assigned to by HTCondor, and the name of the execution point resource is provided in GLIDEIN_ResourceName The CondorScratchDir is the name of the scratch directory that was created by HTCondor for your job to run inside The Cpus , GPUs , Disk , and Memory values show the amount of each resource allocated to the slot where your job is running Another example is the periodic update: 006 (5739.000.000) 2024-07-10 10:45:20 Image size of job updated: 72 1 - MemoryUsage of job (MB) 72 - ResidentSetSize of job (KB) ... These updates record the amount of memory that the job is using on the execute machine. This can be helpful information, so that in future runs of the job, you can tell HTCondor how much memory you will need. The job termination event includes a lot of very useful information: 005 (5739.000.000) 2024-07-10 10:46:11 Job terminated. (1) Normal termination (return value 0) Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage 0 - Run Bytes Sent By Job 27848 - Run Bytes Received By Job 0 - Total Bytes Sent By Job 27848 - Total Bytes Received By Job Partitionable Resources : Usage Request Allocated Cpus : 1 1 Disk (KB) : 40 30 4203309 Memory (MB) : 1 1 1 Job terminated of its own accord at 2024-07-10 10:46:11 with exit-code 0. ... Probably the most interesting information is: The return value or exit code ( 0 here means the executable completed and didn't indicate any internal errors; non-zero usually means failure) The total number of bytes transferred each way, which could be useful if your network is slow The Partitionable Resources table, especially disk and memory usage, which will inform larger submissions. There are many other kinds of events, but the ones above will occur in almost every job log.","title":"Reading a Log File"},{"location":"materials/htcondor/part1-ex4-logs/#understanding-when-job-log-events-are-written","text":"When are events written to the job log file? Let\u2019s find out. Read through the entire procedure below before starting, because some parts of the process are time sensitive. Change the sleep job submit file, so that the job sleeps for 2 minutes (= 120 seconds) Submit the updated sleep job As soon as the condor_submit command finishes, hit the return key a few times, to create some blank lines Right away, run a command to show the log file and keep showing updates as they occur: username@ap1 $ tail -f sleep.log Watch the output carefully. When do events appear in the log file? After the termination event appears, press Control-C to end the tail command and return to the shell prompt.","title":"Understanding When Job Log Events Are Written"},{"location":"materials/htcondor/part1-ex4-logs/#understanding-how-htcondor-writes-files","text":"When HTCondor writes the output, error, and log files, does it erase the previous contents of the file or does it add new lines onto the end? Let\u2019s find out! For this exercise, we can use the hostname job from earlier. Edit the hostname submit file so that it uses new and unique filenames for output, error, and log files. Alternatively, delete any existing output, error, and log files from previous runs of the hostname job. 
Submit the job three separate times in a row (there are better ways to do this, which we will cover in the next lecture) Wait for all three jobs to finish Examine the output file: How many hostnames are there? Did HTCondor erase the previous contents for each job, or add new lines? Examine the log file\u2026 carefully: What happened there? Pay close attention to the times and job IDs of the events. For further clarification about how HTCondor handles these files, reach out to your mentor or one of the other school staff.","title":"Understanding How HTCondor Writes Files"},{"location":"materials/htcondor/part1-ex5-request/","text":"pre em { font-style: normal; background-color: yellow; } pre strong { font-style: normal; font-weight: bold; color: \\#008; } HTC Exercise 1.5: Declare Resource Needs \u00b6 The goal of this exercise is to demonstrate how to test and tune the request_X statements in a submit file for when you don't know what resources your job needs. There are three special resource request statements that you can use (optionally) in an HTCondor submit file: request_cpus for the number of CPUs your job will use. A value of \"1\" is always a great starting point, but some software can use more than \"1\" (however, most softwares will use an argument to control this number). request_memory for the maximum amount of run-time memory your job may use. request_disk for the maximum amount of disk space your job may use (including the executable and all other data that may show up during the job). HTCondor defaults to certain reasonable values for these request settings, so you do not need to use them to get small jobs to run. However, it is in YOUR best interest to always estimate resource requests before submitting any job, and to definitely tune your requests before submitting multiple jobs. In many HTCondor pools: If your job goes over the request values, it may be removed from the execute machine and held (status 'H' in the condor_q output, awaiting action on your part) without saving any partial job output files. So it is a disadvantage to not declare your resource needs or if you underestimate them. Conversely, if you overestimate them by too much, your jobs will match to fewer slots and take longer to match to a slot to begin running. Additionally, by hogging up resources that you don't need, other users may be deprived of the resources they require. In the long run, it works better for all users of the pool if you declare what you really need. But how do you know what to request? In particular, we are concerned with memory and disk here; requesting multiple CPUs and using them is covered a bit in later school materials, but true HTC splits work up into jobs that each use as few CPU cores as possible (one CPU core is always best to have the most jobs running). Determining Resource Needs Before Running Any Jobs \u00b6 Note If you are running short on time, you can skip to \"Determining Resource Needs By Running Test Jobs\", below, but try to come back and read over this part at some point. It can be very difficult to predict the memory needs of your running program without running tests. Typically, the memory size of a job changes over time, making the task even trickier. If you have knowledge ahead of time about your job\u2019s maximum memory needs, use that, or maybe a number that's just a bit higher, to ensure your job has enough memory to complete. 
If this is your first time running your job, you can request a fairly large amount of memory (as high as what's on your laptop or other server, if you know your program can run without crashing) for a first test job, OR you can run the program locally and \"watch\" it: Examining a Running Program on a Local Computer \u00b6 When working on a shared access point, you should not run computationally-intensive work because it can use resources needed by HTCondor to manage the queue for all users. However, you may have access to other computers (your laptop, for example, or another server) where you can observe the memory usage of a program. The downside is that you'll have to watch a program run for essentially the entire time, to make sure you catch the maximum memory usage. For Memory: \u00b6 On Mac and Windows, for example, the \"Activity Monitor\" and \"Task Manager\" applications may be useful. On a Mac or Linux system, you can use the ps command or the top command in the Terminal to watch a running program and see (roughly) how much memory it is using. Full coverage of these tools is beyond the scope of this exercise, but here are two quick examples: Using ps : username@ap1 $ ps ux USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND alice 24342 0.0 0.0 90224 1864 ? S 13:39 0:00 sshd: alice@pts/0 alice 24343 0.0 0.0 66096 1580 pts/0 Ss 13:39 0:00 -bash alice 25864 0.0 0.0 65624 996 pts/0 R+ 13:52 0:00 ps ux alice 30052 0.0 0.0 90720 2456 ? S Jun22 0:00 sshd: alice@pts/2 alice 30053 0.0 0.0 66096 1624 pts/2 Ss+ Jun22 0:00 -bash The Resident Set Size ( RSS ) column, highlighted above, gives a rough indication of the memory usage (in KB) of each running process. If your program runs long enough, you can run this command several times and note the greatest value. Using top : username@ap1 $ top -u USERNAME top - 13:55:31 up 11 days, 20:59, 5 users, load average: 0.12, 0.12, 0.09 Tasks: 198 total, 1 running, 197 sleeping, 0 stopped, 0 zombie Cpu(s): 1.2%us, 0.1%sy, 0.0%ni, 98.5%id, 0.2%wa, 0.0%hi, 0.1%si, 0.0%st Mem: 4001440k total, 3558028k used, 443412k free, 258568k buffers Swap: 4194296k total, 148k used, 4194148k free, 2960760k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 24342 alice 15 0 90224 1864 1096 S 0.0 0.0 0:00.26 sshd 24343 alice 15 0 66096 1580 1232 S 0.0 0.0 0:00.07 bash 25927 alice 15 0 12760 1196 836 R 0.0 0.0 0:00.01 top 30052 alice 16 0 90720 2456 1112 S 0.0 0.1 0:00.69 sshd 30053 alice 18 0 66096 1624 1236 S 0.0 0.0 0:00.37 bash The top command (shown here with an option to limit the output to a single user ID) also shows information about running processes, but updates periodically by itself. Type the letter q to quit the interactive display. Again, the highlighted RES column shows an approximation of memory usage. For Disk: \u00b6 Determining disk needs may be a bit easier, because you can check on the size of files that a program is using while it runs. However, it is important to count all files that HTCondor counts to get an accurate size. HTCondor counts everything in your job sandbox toward your job\u2019s disk usage: The executable itself All \"input\" files (anything else that gets transferred TO the job, even if you don't think of it as \"input\") All files created during the job (broadly defined as \"output\"), including the captured standard output and error files that you list in the submit file. All temporary files created in the sandbox, even if they get deleted by the executable before it's done. 
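For example, on your own computer you could total up the size of a test directory at its largest point with the du command described below (a minimal sketch; the directory name my-test-run is an assumption): $ du -sh my-test-run/ 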
If you can run your program within a single directory on a local computer (not on the access point), you should be able to view files and their sizes with the ls and du commands. Determining Resource Needs By Running Test Jobs (BEST) \u00b6 Despite the techniques mentioned above, by far the easiest approach to measuring your job\u2019s resource needs is to run one or a small number of sample jobs and have HTCondor itself tell you about the resources used during the runs. For example, here is a strange Python script that does not do anything useful, but consumes some real resources while running: #!/usr/bin/env python3 import time import os size = 1000000 numbers = [] for i in range ( size ): numbers . append ( str ( i )) with open ( 'numbers.txt' , 'w' ) as tempfile : tempfile . write ( ' ' . join ( numbers )) time . sleep ( 60 ) Without trying to figure out what this code does or how many resources it uses, create a submit file for it, and run it once with HTCondor, starting with somewhat high memory requests (\"1GB\" for memory and disk is a good starting point, unless you think the job will use far more). When it is done, examine the log file. In particular, we care about these lines: Partitionable Resources : Usage Request Allocated Cpus : 1 1 Disk (KB) : 6739 1048576 8022934 Memory (MB) : 3 1024 1024 So, now we know that HTCondor saw that the job used 6,739 KB of disk (= about 6.5 MB) and 3 MB of memory! This is a great technique for determining the real resource needs of your job. If you think resource needs vary from run to run, submit a few sample jobs and look at all the results. You should round up your resource requests a little, just in case your job occasionally uses more resources. Setting Resource Requirements \u00b6 Once you know your job\u2019s resource requirements, it is easy to declare them in your submit file. For example, taking our results above as an example, we might slightly increase our requests above what was used, just to be safe: # rounded up from 3 MB request_memory = 4MB # rounded up from 6.5 MB request_disk = 7MB Pay close attention to units: Without explicit units, request_memory is in MB (megabytes) Without explicit units, request_disk is in KB (kilobytes) Allowable units are KB (kilobytes), MB (megabytes), GB (gigabytes), and TB (terabytes) HTCondor translates these requirements into attributes that become part of the job's requirements expression. However, do not put your CPU, memory, and disk requirements directly into the requirements expression; use the request_XXX statements instead. If you still have time in this working session, Add these requirements to your submit file for the Python script, rerun the job, and confirm in the log file that your requests were used. After changing the requirements in your submit file, did your job run successfully? If not, why? (Hint: HTCondor polls a job's resource use on a timer. How long are these jobs running for?)","title":"1.5 - Determining resource needs"},{"location":"materials/htcondor/part1-ex5-request/#htc-exercise-15-declare-resource-needs","text":"The goal of this exercise is to demonstrate how to test and tune the request_X statements in a submit file for when you don't know what resources your job needs. There are three special resource request statements that you can use (optionally) in an HTCondor submit file: request_cpus for the number of CPUs your job will use. 
A value of \"1\" is always a great starting point, but some software can use more than \"1\" (however, most software will use an argument to control this number). request_memory for the maximum amount of run-time memory your job may use. request_disk for the maximum amount of disk space your job may use (including the executable and all other data that may show up during the job). HTCondor defaults to certain reasonable values for these request settings, so you do not need to use them to get small jobs to run. However, it is in YOUR best interest to always estimate resource requests before submitting any job, and to definitely tune your requests before submitting multiple jobs. In many HTCondor pools: If your job goes over the request values, it may be removed from the execute machine and held (status 'H' in the condor_q output, awaiting action on your part) without saving any partial job output files. So it is a disadvantage not to declare your resource needs, or to underestimate them. Conversely, if you overestimate them by too much, your jobs will match to fewer slots and take longer to begin running. Additionally, if you hog resources that you don't need, other users may be deprived of the resources they require. In the long run, it works better for all users of the pool if you declare what you really need. But how do you know what to request? In particular, we are concerned with memory and disk here; requesting multiple CPUs and using them is covered a bit in later school materials, but true HTC splits work up into jobs that each use as few CPU cores as possible (one CPU core per job is best for having the most jobs running).","title":"HTC Exercise 1.5: Declare Resource Needs"},{"location":"materials/htcondor/part1-ex5-request/#determining-resource-needs-before-running-any-jobs","text":"Note If you are running short on time, you can skip to \"Determining Resource Needs By Running Test Jobs\", below, but try to come back and read over this part at some point. It can be very difficult to predict the memory needs of your running program without running tests. Typically, the memory size of a job changes over time, making the task even trickier. If you have knowledge ahead of time about your job\u2019s maximum memory needs, use that, or maybe a number that's just a bit higher, to ensure your job has enough memory to complete. If this is your first time running your job, you can request a fairly large amount of memory (as high as what's on your laptop or other server, if you know your program can run without crashing) for a first test job, OR you can run the program locally and \"watch\" it:","title":"Determining Resource Needs Before Running Any Jobs"},{"location":"materials/htcondor/part1-ex5-request/#examining-a-running-program-on-a-local-computer","text":"When working on a shared access point, you should not run computationally-intensive work because it can use resources needed by HTCondor to manage the queue for all users. However, you may have access to other computers (your laptop, for example, or another server) where you can observe the memory usage of a program. The downside is that you'll have to watch a program run for essentially the entire time, to make sure you catch the maximum memory usage.","title":"Examining a Running Program on a Local Computer"},{"location":"materials/htcondor/part1-ex5-request/#for-memory","text":"On Mac and Windows, for example, the \"Activity Monitor\" and \"Task Manager\" applications may be useful.
On a Mac or Linux system, you can use the ps command or the top command in the Terminal to watch a running program and see (roughly) how much memory it is using. Full coverage of these tools is beyond the scope of this exercise, but here are two quick examples: Using ps : username@ap1 $ ps ux USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND alice 24342 0.0 0.0 90224 1864 ? S 13:39 0:00 sshd: alice@pts/0 alice 24343 0.0 0.0 66096 1580 pts/0 Ss 13:39 0:00 -bash alice 25864 0.0 0.0 65624 996 pts/0 R+ 13:52 0:00 ps ux alice 30052 0.0 0.0 90720 2456 ? S Jun22 0:00 sshd: alice@pts/2 alice 30053 0.0 0.0 66096 1624 pts/2 Ss+ Jun22 0:00 -bash The Resident Set Size ( RSS ) column, highlighted above, gives a rough indication of the memory usage (in KB) of each running process. If your program runs long enough, you can run this command several times and note the greatest value. Using top : username@ap1 $ top -u top - 13:55:31 up 11 days, 20:59, 5 users, load average: 0.12, 0.12, 0.09 Tasks: 198 total, 1 running, 197 sleeping, 0 stopped, 0 zombie Cpu(s): 1.2%us, 0.1%sy, 0.0%ni, 98.5%id, 0.2%wa, 0.0%hi, 0.1%si, 0.0%st Mem: 4001440k total, 3558028k used, 443412k free, 258568k buffers Swap: 4194296k total, 148k used, 4194148k free, 2960760k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 24342 alice 15 0 90224 1864 1096 S 0.0 0.0 0:00.26 sshd 24343 alice 15 0 66096 1580 1232 S 0.0 0.0 0:00.07 bash 25927 alice 15 0 12760 1196 836 R 0.0 0.0 0:00.01 top 30052 alice 16 0 90720 2456 1112 S 0.0 0.1 0:00.69 sshd 30053 alice 18 0 66096 1624 1236 S 0.0 0.0 0:00.37 bash The top command (shown here with an option to limit the output to a single user ID) also shows information about running processes, but updates periodically by itself. Type the letter q to quit the interactive display. Again, the highlighted RES column shows an approximation of memory usage.","title":"For Memory:"},{"location":"materials/htcondor/part1-ex5-request/#for-disk","text":"Determining disk needs may be a bit easier, because you can check on the size of files that a program is using while it runs. However, it is important to count all files that HTCondor counts to get an accurate size. HTCondor counts everything in your job sandbox toward your job\u2019s disk usage: The executable itself All \"input\" files (anything else that gets transferred TO the job, even if you don't think of it as \"input\") All files created during the job (broadly defined as \"output\"), including the captured standard output and error files that you list in the submit file. All temporary files created in the sandbox, even if they get deleted by the executable before it's done. If you can run your program within a single directory on a local computer (not on the access point), you should be able to view files and their sizes with the ls and du commands.","title":"For Disk:"},{"location":"materials/htcondor/part1-ex5-request/#determining-resource-needs-by-running-test-jobs-best","text":"Despite the techniques mentioned above, by far the easiest approach to measuring your job\u2019s resource needs is to run one or a small number of sample jobs and have HTCondor itself tell you about the resources used during the runs. For example, here is a strange Python script that does not do anything useful, but consumes some real resources while running: #!/usr/bin/env python3 import time import os size = 1000000 numbers = [] for i in range ( size ): numbers . append ( str ( i )) with open ( 'numbers.txt' , 'w' ) as tempfile : tempfile . write ( ' ' . 
join ( numbers )) time . sleep ( 60 ) Without trying to figure out what this code does or how many resources it uses, create a submit file for it, and run it once with HTCondor, starting with somewhat high memory requests (\"1GB\" for memory and disk is a good starting point, unless you think the job will use far more). When it is done, examine the log file. In particular, we care about these lines: Partitionable Resources : Usage Request Allocated Cpus : 1 1 Disk (KB) : 6739 1048576 8022934 Memory (MB) : 3 1024 1024 So, now we know that HTCondor saw that the job used 6,739 KB of disk (= about 6.5 MB) and 3 MB of memory! This is a great technique for determining the real resource needs of your job. If you think resource needs vary from run to run, submit a few sample jobs and look at all the results. You should round up your resource requests a little, just in case your job occasionally uses more resources.","title":"Determining Resource Needs By Running Test Jobs (BEST)"},{"location":"materials/htcondor/part1-ex5-request/#setting-resource-requirements","text":"Once you know your job\u2019s resource requirements, it is easy to declare them in your submit file. For example, taking our results above as an example, we might slightly increase our requests above what was used, just to be safe: # rounded up from 3 MB request_memory = 4MB # rounded up from 6.5 MB request_disk = 7MB Pay close attention to units: Without explicit units, request_memory is in MB (megabytes) Without explicit units, request_disk is in KB (kilobytes) Allowable units are KB (kilobytes), MB (megabytes), GB (gigabytes), and TB (terabytes) HTCondor translates these requirements into attributes that become part of the job's requirements expression. However, do not put your CPU, memory, and disk requirements directly into the requirements expression; use the request_XXX statements instead. If you still have time in this working session, Add these requirements to your submit file for the Python script, rerun the job, and confirm in the log file that your requests were used. After changing the requirements in your submit file, did your job run successfully? If not, why? (Hint: HTCondor polls a job's resource use on a timer. How long are these jobs running for?)","title":"Setting Resource Requirements"},{"location":"materials/htcondor/part1-ex6-remove/","text":"pre em { font-style: normal; background-color: yellow; } pre strong { font-style: normal; font-weight: bold; color: \\#008; } HTC Exercise 1.6: Remove Jobs From the Queue \u00b6 Exercise Goal \u00b6 The goal of this exercise is to show you how to remove jobs from the queue. This is helpful if you make a mistake, do not want to wait for a job to complete, or otherwise need to fix things. For example, if some test jobs go on hold for using too much memory or disk, you may want to just remove them, edit the submit files, and then submit again. Skip this exercise and come back to it if you are short on time, or until you need to remove jobs for other exercises Note Please remember to remove any jobs from the queue that you are no longer interested in. Otherwise, the queue will start to get very long with jobs that will waste resources (and decrease your priority), or that may never run (if they're on hold, or have other issues keeping them from matching). This exercise is very short, but if you are out of time, you can come back to it later. Removing a Job or Cluster From the Queue \u00b6 To practice removing jobs from the queue, you need a job in the queue! 
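If you do not have a submit file handy from an earlier exercise, a minimal sleep job along the following lines also works well for this practice (a sketch; the filenames are just placeholders):

executable = /bin/sleep
arguments = 600
output = rmtest.out
error = rmtest.err
log = rmtest.log
request_cpus = 1
request_memory = 100MB
request_disk = 10MB
queue

The long sleep gives you plenty of time to remove the job before it finishes on its own.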
Submit a job from an earlier exercise Determine the job ID ( cluster.process ) from the condor_submit output or from condor_q Remove the job: username@ap1 $ condor_rm Use the full job ID this time, e.g. 5759.0 . Did the job leave the queue immediately? If not, about how long did it take? So far, we have created job clusters that contain only one job process (the .0 part of the job ID). That will change soon, so it is good to know how to remove a specific job ID. However, it is possible to remove all jobs that are part of a cluster at once. Simply omit the job process (the .0 part of the job ID) in the condor_rm command: username@ap1 $ condor_rm Finally, you can include many job clusters and full job IDs in a single condor_rm command. For example: username@ap1 $ condor_rm 5768 5769 5770 .0 5771 .2 Removing All of Your Jobs \u00b6 If you really want to remove all of your jobs at once, you can do that with: username@ap1 $ condor_rm If you want to test it: (optional, though you'll likely need this in the future) Quickly submit several jobs from past exercises View the jobs in the queue with condor_q Remove them all with the above command Use condor_q to track progress In case you are wondering, you can remove only your own jobs. HTCondor administrators can remove anyone\u2019s jobs, so be nice to them.","title":"1.6 - Remove jobs from the queue"},{"location":"materials/htcondor/part1-ex6-remove/#htc-exercise-16-remove-jobs-from-the-queue","text":"","title":"HTC Exercise 1.6: Remove Jobs From the Queue"},{"location":"materials/htcondor/part1-ex6-remove/#exercise-goal","text":"The goal of this exercise is to show you how to remove jobs from the queue. This is helpful if you make a mistake, do not want to wait for a job to complete, or otherwise need to fix things. For example, if some test jobs go on hold for using too much memory or disk, you may want to just remove them, edit the submit files, and then submit again. Skip this exercise and come back to it if you are short on time, or until you need to remove jobs for other exercises Note Please remember to remove any jobs from the queue that you are no longer interested in. Otherwise, the queue will start to get very long with jobs that will waste resources (and decrease your priority), or that may never run (if they're on hold, or have other issues keeping them from matching). This exercise is very short, but if you are out of time, you can come back to it later.","title":"Exercise Goal"},{"location":"materials/htcondor/part1-ex6-remove/#removing-a-job-or-cluster-from-the-queue","text":"To practice removing jobs from the queue, you need a job in the queue! Submit a job from an earlier exercise Determine the job ID ( cluster.process ) from the condor_submit output or from condor_q Remove the job: username@ap1 $ condor_rm Use the full job ID this time, e.g. 5759.0 . Did the job leave the queue immediately? If not, about how long did it take? So far, we have created job clusters that contain only one job process (the .0 part of the job ID). That will change soon, so it is good to know how to remove a specific job ID. However, it is possible to remove all jobs that are part of a cluster at once. Simply omit the job process (the .0 part of the job ID) in the condor_rm command: username@ap1 $ condor_rm Finally, you can include many job clusters and full job IDs in a single condor_rm command. 
For example: username@ap1 $ condor_rm 5768 5769 5770 .0 5771 .2","title":"Removing a Job or Cluster From the Queue"},{"location":"materials/htcondor/part1-ex6-remove/#removing-all-of-your-jobs","text":"If you really want to remove all of your jobs at once, you can do that with: username@ap1 $ condor_rm If you want to test it: (optional, though you'll likely need this in the future) Quickly submit several jobs from past exercises View the jobs in the queue with condor_q Remove them all with the above command Use condor_q to track progress In case you are wondering, you can remove only your own jobs. HTCondor administrators can remove anyone\u2019s jobs, so be nice to them.","title":"Removing All of Your Jobs"},{"location":"materials/htcondor/part1-ex7-compile/","text":"pre em { font-style: normal; background-color: yellow; } pre strong { font-style: normal; font-weight: bold; color: \\#008; } HTC Bonus Exercise 1.7: Compile and Run Some C Code \u00b6 The goal of this exercise is to show that compiled code works just fine in HTCondor. It is mainly of interest to people who have their own C code to run (or C++, or really any compiled code, although Java would be handled a bit differently). Preparing a C Executable \u00b6 When preparing a C program for HTCondor, it is best to compile and link the executable statically, so that it does not depend on external libraries and their particular versions. Why is this important? When your compiled C program is sent to another machine for execution, that machine may not have the same libraries that you have on your submit machine (or wherever you compile the program). If the libraries are not available or are the wrong versions, your program may fail or, perhaps worse, silently produce the wrong results. Here is a simple C program to try using (thanks, Alain Roy): #include #include #include int main ( int argc , char ** argv ) { int sleep_time ; int input ; int failure ; if ( argc != 3 ) { printf ( \"Usage: simple \\n \" ); failure = 1 ; } else { sleep_time = atoi ( argv [ 1 ]); input = atoi ( argv [ 2 ]); printf ( \"Thinking really hard for %d seconds... \\n \" , sleep_time ); sleep ( sleep_time ); printf ( \"We calculated: %d \\n \" , input * 2 ); failure = 0 ; } return failure ; } Save that code to a file, for example, simple.c . Compile the program with static linking: username@ap1 $ gcc -static -o simple simple.c As always, test that you can run your command from the command line first. First, without arguments to make sure it fails correctly: username@ap1 $ ./simple and then with valid arguments: username@ap1 $ ./simple 5 21 Running a Compiled C Program \u00b6 Running the compiled program is no different than running any other program. Here is a submit file for the C program (call it simple.sub): executable = simple arguments = \"60 64\" output = c-program.out error = c-program.err log = c-program.log should_transfer_files = YES when_to_transfer_output = ON_EXIT request_cpus = 1 request_memory = 1GB request_disk = 1MB queue Then submit the job as usual! In summary, it is easy to work with statically linked compiled code. It is possible to handle dynamically linked compiled code, but it is trickier. We will only mention this topic briefly during the lecture on Software.","title":"Bonus Exercise 1.7 - Compile and run some C code"},{"location":"materials/htcondor/part1-ex7-compile/#htc-bonus-exercise-17-compile-and-run-some-c-code","text":"The goal of this exercise is to show that compiled code works just fine in HTCondor. 
It is mainly of interest to people who have their own C code to run (or C++, or really any compiled code, although Java would be handled a bit differently).","title":"HTC Bonus Exercise 1.7: Compile and Run Some C Code"},{"location":"materials/htcondor/part1-ex7-compile/#preparing-a-c-executable","text":"When preparing a C program for HTCondor, it is best to compile and link the executable statically, so that it does not depend on external libraries and their particular versions. Why is this important? When your compiled C program is sent to another machine for execution, that machine may not have the same libraries that you have on your submit machine (or wherever you compile the program). If the libraries are not available or are the wrong versions, your program may fail or, perhaps worse, silently produce the wrong results. Here is a simple C program to try using (thanks, Alain Roy): #include #include #include int main ( int argc , char ** argv ) { int sleep_time ; int input ; int failure ; if ( argc != 3 ) { printf ( \"Usage: simple \\n \" ); failure = 1 ; } else { sleep_time = atoi ( argv [ 1 ]); input = atoi ( argv [ 2 ]); printf ( \"Thinking really hard for %d seconds... \\n \" , sleep_time ); sleep ( sleep_time ); printf ( \"We calculated: %d \\n \" , input * 2 ); failure = 0 ; } return failure ; } Save that code to a file, for example, simple.c . Compile the program with static linking: username@ap1 $ gcc -static -o simple simple.c As always, test that you can run your command from the command line first. First, without arguments to make sure it fails correctly: username@ap1 $ ./simple and then with valid arguments: username@ap1 $ ./simple 5 21","title":"Preparing a C Executable"},{"location":"materials/htcondor/part1-ex7-compile/#running-a-compiled-c-program","text":"Running the compiled program is no different than running any other program. Here is a submit file for the C program (call it simple.sub): executable = simple arguments = \"60 64\" output = c-program.out error = c-program.err log = c-program.log should_transfer_files = YES when_to_transfer_output = ON_EXIT request_cpus = 1 request_memory = 1GB request_disk = 1MB queue Then submit the job as usual! In summary, it is easy to work with statically linked compiled code. It is possible to handle dynamically linked compiled code, but it is trickier. We will only mention this topic briefly during the lecture on Software.","title":"Running a Compiled C Program"},{"location":"materials/htcondor/part1-ex8-queue/","text":"pre em { font-style: normal; background-color: yellow; } pre strong { font-style: normal; font-weight: bold; color: \\#008; } Bonus HTC Exercise 1.8: Explore condor_q \u00b6 The goal of this exercise is try out some of the most common options to the condor_q command, so that you can view jobs effectively. The main part of this exercise should take just a few minutes, but if you have more time later, come back and work on the extension ideas at the end to become a condor_q expert! Selecting Jobs \u00b6 The condor_q program has many options for selecting which jobs are listed. 
You have already seen that the default mode is to show only your jobs in \"batch\" mode: username@ap1 $ condor_q You've seen that you can view all jobs (all users) in the submit node's queue by using the -all argument: username@ap1 $ condor_q -all And you've seen that you can view more details about queued jobs, with each separate job on a single line using the -nobatch option: username@ap1 $ condor_q -nobatch username@ap1 $ condor_q -all -nobatch Did you know you can also name one or more user IDs on the command line, in which case jobs for all of the named users are listed at once? username@ap1 $ condor_q To list just the jobs associated with a single cluster number: username@ap1 $ condor_q For example, if you want to see the jobs in cluster 5678 (i.e., 5678.0 , 5678.1 , etc.), you use condor_q 5678 . To list a specific job (i.e., cluster.process, as in 5678.0): username@ap1 $ condor_q For example, to see job ID 5678.1, you use condor_q 5678.1 . Note You can name more than one cluster, job ID, or combination thereof on the command line, in which case jobs for all of the named clusters and/or job IDs are listed. Let\u2019s get some practice using condor_q selections! Using a previous exercise, submit several sleep jobs. List all jobs in the queue \u2014 are there others besides your own? Practice using all forms of condor_q that you have learned: List just your jobs, with and without batching. List a specific cluster. List a specific job ID. Try listing several users at once. Try listing several clusters and job IDs at once. When there are a variety of jobs in the queue, try combining a username and a different user's cluster or job ID in the same command \u2014 what happens? Viewing a Job ClassAd \u00b6 You may have wondered why it is useful to be able to list a single job ID using condor_q . By itself, it may not be that useful. But, in combination with another option, it is very useful! If you add the -long option to condor_q (or its short form, -l ), it will show the complete ClassAd for each selected job, instead of the one-line summary that you have seen so far. Because job ClassAds may have 80\u201390 attributes (or more), it probably makes the most sense to show the ClassAd for a single job at a time. And you know how to show just one job! Here is what the command looks like: username@ap1 $ condor_q -long The output from this command is long and complex. Most of the attributes that HTCondor adds to a job are arcane and uninteresting for us now. But here are some examples of common, interesting attributes taken directly from condor_q output (except with some line breaks added to the Requirements attribute): MyType = \"Job\" Err = \"sleep.err\" UserLog = \"/home/cat/intro-2.1-queue/sleep.log\" Requirements = ( IsOSGSchoolSlot =?= true ) && ( TARGET.Arch == \"X86_64\" ) && ( TARGET.OpSys == \"LINUX\" ) && ( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory ) && ( TARGET.HasFileTransfer ) ClusterId = 2420 WhenToTransferOutput = \"ON_EXIT\" Owner = \"cat\" CondorVersion = \"$CondorVersion: 8.5.5 May 03 2016 BuildID: 366162 $\" Out = \"sleep.out\" Cmd = \"/bin/sleep\" Arguments = \"120\" Note Attributes are listed in no particular order and may change from time to time. Do not assume anything about the order of attributes in condor_q output. See what you can find in a job ClassAd from your own job. Using a previous exercise, submit a sleep job that sleeps for at least 3 minutes (180 seconds). 
Before the job executes, capture its ClassAd and save to a file: condor_q -l > classad-1.txt After the job starts execution but before it finishes, capture its ClassAd again and save to a file condor_q -l > classad-2.txt Now examine each saved ClassAd file. Here are a few things to look for: Can you find attributes that came from your submit file? (E.g., Cmd, Arguments, Out, Err, UserLog, and so forth) Can you find attributes that could have come from your submit file, but that HTCondor added for you? (E.g., Requirements) How many of the following attributes can you guess the meaning of? DiskUsage ImageSize BytesSent JobStatus Why Is My Job Not Running? \u00b6 Sometimes, you submit a job and it just sits in the queue in Idle state, never running. It can be difficult to figure out why a job never matches and runs. Fortunately, HTCondor can give you some help. To ask HTCondor why your job is not running, add the -better-analyze option to condor_q for the specific job. For example, for job ID 2423.0, the command is: username@ap1 $ condor_q -better-analyze 2423 .0 Of course, replace the job ID with your own. Let\u2019s submit a job that will never run and see what happens. Here is the submit file to use: executable = /bin/hostname output = norun.out error = norun.err log = norun.log should_transfer_files = YES when_to_transfer_output = ON_EXIT request_disk = 10MB request_memory = 8TB queue (Do you see what I did?) Save and submit this file. Run condor_q -better-analyze on the job ID. There is a lot of output, but a few items are worth highlighting. Here is a sample from my own job (with some lines omitted): -- Schedd: ap1.facility.path-cc.io : <128.105.68.66:9618?... ... Job 98096.000 defines the following attributes: RequestDisk = 10240 RequestMemory = 8388608 The Requirements expression for job 98096.000 reduces to these conditions: Slots Step Matched Condition ----- -------- --------- [1] 11227 Target.OpSysMajorVer == 7 [9] 13098 TARGET.Disk >= RequestDisk [11] 0 TARGET.Memory >= RequestMemory No successful match recorded. Last failed match: Fri Jul 12 15:36:30 2019 Reason for last match failure: no match found 98096.000: Run analysis summary ignoring user priority. Of 710 machines, 710 are rejected by your job's requirements 0 reject your job because of their own requirements 0 match and are already running your jobs 0 match but are serving other users 0 are able to run your job ... At the end of the summary, condor_q provides a breakdown of how machines and their own requirements match against my own job's requirements. 710 total machines were considered above, and all of them were rejected based on my job's requirements . In other words, I am asking for something that is not available. But what? Further up in the output, there is an analysis of the job's requirements, along with how many slots within the pool match each of those requirements. The example above reports that 13098 slots match our small disk request request, but none of the slots matched the TARGET.Memory >= RequestMemory condition. The output also reports the value used for the RequestMemory attribute: my job asked for 8 terabytes of memory (8,388,608 MB) -- of course no machines matched that part of the expression! That's a lot of memory on today's machines. The output from condor_q -analyze (and condor_q -better-analyze ) may be helpful or it may not be, depending on your exact case. The example above was constructed so that it would be obvious what the problem was. 
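As an aside (not part of the original exercise): once you have spotted a mistake like this, the usual fix is to remove the job with condor_rm, correct the submit file, and submit again. It is also possible to change an attribute of a job that is still in the queue with condor_qedit, for example:

username@ap1 $ condor_qedit 98096.0 RequestMemory 1024

(Here 1024 means 1024 MB, since the RequestMemory attribute is expressed in megabytes.) Either way, condor_q -better-analyze is what tells you which requirement is the problem in the first place.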
But in many cases, this is a good place to start looking if you are having problems matching. Bonus: Automatic Formatting Output \u00b6 Do this exercise only if you have time, though it's pretty awesome! There is a way to select the specific job attributes you want condor_q to tell you about with the -autoformat or -af option. In this case, HTCondor decides for you how to format the data you ask for from job ClassAd(s). (To tell HTCondor how to specially format this information, yourself, you could use the -format option, which we're not covering.) To use autoformatting, use the -af option followed by the attribute name, for each attribute that you want to output: username@ap1 $ condor_q -all -af Owner ClusterId Cmd moate 2418 /share/test.sh cat 2421 /bin/sleep cat 2422 /bin/sleep Bonus Question : If you wanted to print out the Requirements expression of a job, how would you do that with -af ? Is the output what you expected? (HINT: for ClassAd attributes like \"Requirements\" that are long expressions, instead of plain values, you can use -af:r to view the expressions, instead of what it's current evaluation.) References \u00b6 As suggested above, if you want to learn more about condor_q , you can do some reading: Read the condor_q man page or HTCondor Manual section (same text) to learn about more options Read about ClassAd attributes in the HTCondor Manual","title":"Bonus Exercise 1.8 - Explore condor_q"},{"location":"materials/htcondor/part1-ex8-queue/#bonus-htc-exercise-18-explore-condor_q","text":"The goal of this exercise is try out some of the most common options to the condor_q command, so that you can view jobs effectively. The main part of this exercise should take just a few minutes, but if you have more time later, come back and work on the extension ideas at the end to become a condor_q expert!","title":"Bonus HTC Exercise 1.8: Explore condor_q"},{"location":"materials/htcondor/part1-ex8-queue/#selecting-jobs","text":"The condor_q program has many options for selecting which jobs are listed. You have already seen that the default mode is to show only your jobs in \"batch\" mode: username@ap1 $ condor_q You've seen that you can view all jobs (all users) in the submit node's queue by using the -all argument: username@ap1 $ condor_q -all And you've seen that you can view more details about queued jobs, with each separate job on a single line using the -nobatch option: username@ap1 $ condor_q -nobatch username@ap1 $ condor_q -all -nobatch Did you know you can also name one or more user IDs on the command line, in which case jobs for all of the named users are listed at once? username@ap1 $ condor_q To list just the jobs associated with a single cluster number: username@ap1 $ condor_q For example, if you want to see the jobs in cluster 5678 (i.e., 5678.0 , 5678.1 , etc.), you use condor_q 5678 . To list a specific job (i.e., cluster.process, as in 5678.0): username@ap1 $ condor_q For example, to see job ID 5678.1, you use condor_q 5678.1 . Note You can name more than one cluster, job ID, or combination thereof on the command line, in which case jobs for all of the named clusters and/or job IDs are listed. Let\u2019s get some practice using condor_q selections! Using a previous exercise, submit several sleep jobs. List all jobs in the queue \u2014 are there others besides your own? Practice using all forms of condor_q that you have learned: List just your jobs, with and without batching. List a specific cluster. List a specific job ID. Try listing several users at once. 
Try listing several clusters and job IDs at once. When there are a variety of jobs in the queue, try combining a username and a different user's cluster or job ID in the same command \u2014 what happens?","title":"Selecting Jobs"},{"location":"materials/htcondor/part1-ex8-queue/#viewing-a-job-classad","text":"You may have wondered why it is useful to be able to list a single job ID using condor_q . By itself, it may not be that useful. But, in combination with another option, it is very useful! If you add the -long option to condor_q (or its short form, -l ), it will show the complete ClassAd for each selected job, instead of the one-line summary that you have seen so far. Because job ClassAds may have 80\u201390 attributes (or more), it probably makes the most sense to show the ClassAd for a single job at a time. And you know how to show just one job! Here is what the command looks like: username@ap1 $ condor_q -long The output from this command is long and complex. Most of the attributes that HTCondor adds to a job are arcane and uninteresting for us now. But here are some examples of common, interesting attributes taken directly from condor_q output (except with some line breaks added to the Requirements attribute): MyType = \"Job\" Err = \"sleep.err\" UserLog = \"/home/cat/intro-2.1-queue/sleep.log\" Requirements = ( IsOSGSchoolSlot =?= true ) && ( TARGET.Arch == \"X86_64\" ) && ( TARGET.OpSys == \"LINUX\" ) && ( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory ) && ( TARGET.HasFileTransfer ) ClusterId = 2420 WhenToTransferOutput = \"ON_EXIT\" Owner = \"cat\" CondorVersion = \"$CondorVersion: 8.5.5 May 03 2016 BuildID: 366162 $\" Out = \"sleep.out\" Cmd = \"/bin/sleep\" Arguments = \"120\" Note Attributes are listed in no particular order and may change from time to time. Do not assume anything about the order of attributes in condor_q output. See what you can find in a job ClassAd from your own job. Using a previous exercise, submit a sleep job that sleeps for at least 3 minutes (180 seconds). Before the job executes, capture its ClassAd and save to a file: condor_q -l > classad-1.txt After the job starts execution but before it finishes, capture its ClassAd again and save to a file condor_q -l > classad-2.txt Now examine each saved ClassAd file. Here are a few things to look for: Can you find attributes that came from your submit file? (E.g., Cmd, Arguments, Out, Err, UserLog, and so forth) Can you find attributes that could have come from your submit file, but that HTCondor added for you? (E.g., Requirements) How many of the following attributes can you guess the meaning of? DiskUsage ImageSize BytesSent JobStatus","title":"Viewing a Job ClassAd"},{"location":"materials/htcondor/part1-ex8-queue/#why-is-my-job-not-running","text":"Sometimes, you submit a job and it just sits in the queue in Idle state, never running. It can be difficult to figure out why a job never matches and runs. Fortunately, HTCondor can give you some help. To ask HTCondor why your job is not running, add the -better-analyze option to condor_q for the specific job. For example, for job ID 2423.0, the command is: username@ap1 $ condor_q -better-analyze 2423 .0 Of course, replace the job ID with your own. Let\u2019s submit a job that will never run and see what happens. 
Here is the submit file to use: executable = /bin/hostname output = norun.out error = norun.err log = norun.log should_transfer_files = YES when_to_transfer_output = ON_EXIT request_disk = 10MB request_memory = 8TB queue (Do you see what I did?) Save and submit this file. Run condor_q -better-analyze on the job ID. There is a lot of output, but a few items are worth highlighting. Here is a sample from my own job (with some lines omitted): -- Schedd: ap1.facility.path-cc.io : <128.105.68.66:9618?... ... Job 98096.000 defines the following attributes: RequestDisk = 10240 RequestMemory = 8388608 The Requirements expression for job 98096.000 reduces to these conditions: Slots Step Matched Condition ----- -------- --------- [1] 11227 Target.OpSysMajorVer == 7 [9] 13098 TARGET.Disk >= RequestDisk [11] 0 TARGET.Memory >= RequestMemory No successful match recorded. Last failed match: Fri Jul 12 15:36:30 2019 Reason for last match failure: no match found 98096.000: Run analysis summary ignoring user priority. Of 710 machines, 710 are rejected by your job's requirements 0 reject your job because of their own requirements 0 match and are already running your jobs 0 match but are serving other users 0 are able to run your job ... At the end of the summary, condor_q provides a breakdown of how machines and their own requirements match against my own job's requirements. 710 total machines were considered above, and all of them were rejected based on my job's requirements . In other words, I am asking for something that is not available. But what? Further up in the output, there is an analysis of the job's requirements, along with how many slots within the pool match each of those requirements. The example above reports that 13098 slots match our small disk request request, but none of the slots matched the TARGET.Memory >= RequestMemory condition. The output also reports the value used for the RequestMemory attribute: my job asked for 8 terabytes of memory (8,388,608 MB) -- of course no machines matched that part of the expression! That's a lot of memory on today's machines. The output from condor_q -analyze (and condor_q -better-analyze ) may be helpful or it may not be, depending on your exact case. The example above was constructed so that it would be obvious what the problem was. But in many cases, this is a good place to start looking if you are having problems matching.","title":"Why Is My Job Not Running?"},{"location":"materials/htcondor/part1-ex8-queue/#bonus-automatic-formatting-output","text":"Do this exercise only if you have time, though it's pretty awesome! There is a way to select the specific job attributes you want condor_q to tell you about with the -autoformat or -af option. In this case, HTCondor decides for you how to format the data you ask for from job ClassAd(s). (To tell HTCondor how to specially format this information, yourself, you could use the -format option, which we're not covering.) To use autoformatting, use the -af option followed by the attribute name, for each attribute that you want to output: username@ap1 $ condor_q -all -af Owner ClusterId Cmd moate 2418 /share/test.sh cat 2421 /bin/sleep cat 2422 /bin/sleep Bonus Question : If you wanted to print out the Requirements expression of a job, how would you do that with -af ? Is the output what you expected? 
(HINT: for ClassAd attributes like \"Requirements\" that are long expressions, instead of plain values, you can use -af:r to view the expressions, instead of what it's current evaluation.)","title":"Bonus: Automatic Formatting Output"},{"location":"materials/htcondor/part1-ex8-queue/#references","text":"As suggested above, if you want to learn more about condor_q , you can do some reading: Read the condor_q man page or HTCondor Manual section (same text) to learn about more options Read about ClassAd attributes in the HTCondor Manual","title":"References"},{"location":"materials/htcondor/part1-ex9-status/","text":"pre em { font-style: normal; background-color: yellow; } pre strong { font-style: normal; font-weight: bold; color: \\#008; } Bonus HTC Exercise 1.9: Explore condor_status \u00b6 The goal of this exercise is try out some of the most common options to the condor_status command, so that you can view slots effectively. The main part of this exercise should take just a few minutes, but if you have more time later, come back and work on the extension ideas at the end to become a condor_status expert! Selecting Slots \u00b6 The condor_status program has many options for selecting which slots are listed. You've already learned the basic condor_status and the condor_status -compact variation (which you may wish to retry now, before proceeding). Another convenient option is to list only those slots that are available now: username@ap1 $ condor_status -avail Of course, the individual execute machines only report their slots to the collector at certain time intervals, so this list will not reflect the up-to-the-second reality of all slots. But this limitation is true of all condor_status output, not just with the -avail option. Similar to condor_q , you can limit the slots that are listed in two easy ways. To list just the slots on a specific machine: username@ap1 $ condor_status For example, if you want to see the slots on e2337.chtc.wisc.edu (in the CHTC pool): username@ap1 $ condor_status e2337.chtc.wisc.edu To list a specific slot on a machine: username@ap1 $ condor_status @ For example, to see the \u201cfirst\u201d slot on the machine above: username@ap1 $ condor_status slot1@e2337.chtc.wisc.edu Note You can name more than one hostname, slot, or combination thereof on the command line, in which case slots for all of the named hostnames and/or slots are listed. Let\u2019s get some practice using condor_status selections! List all slots in the pool \u2014 how many are there total? Practice using all forms of condor_status that you have learned: List the available slots. List the slots on a specific machine (e.g., e2337.chtc.wisc.edu ). List a specific slot from that machine. Try listing the slots from a few (but not all) machines at once. Try using a mix of hostnames and slot IDs at once. Viewing a Slot ClassAd \u00b6 Just as with condor_q , you can use condor_status to view the complete ClassAd for a given slot (often confusingly called the \u201cmachine\u201d ad): username@ap1 $ condor_status -long @ Because slot ClassAds may have 150\u2013200 attributes (or more), it probably makes the most sense to show the ClassAd for a single slot at a time, as shown above. 
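For example, to dump the full ClassAd of the slot used earlier (the hostname is just the same illustration as above):

username@ap1 $ condor_status -long slot1@e2337.chtc.wisc.edu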
Here are some examples of common, interesting attributes taken directly from condor_status output: OpSys = \"LINUX\" DetectedCpus = 24 OpSysAndVer = \"SL6\" MyType = \"Machine\" LoadAvg = 0.99 TotalDisk = 798098404 OSIssue = \"Scientific Linux release 6.6 (Carbon)\" TotalMemory = 24016 Machine = \"e242.chtc.wisc.edu\" CondorVersion = \"$CondorVersion: 8.5.5 May 03 2016 BuildID: 366162 $\" Memory = 1024 As you may be able to tell, there is a mix of attributes about the machine as a whole (hence the name \u201cmachine ad\u201d) and about the slot in particular. Go ahead and examine a machine ClassAd now. Viewing Slots by ClassAd Expression \u00b6 Often, it is helpful to view slots that meet some particular criteria. For example, if you know that your job needs a lot of memory to run, you may want to see how many high-memory slots there are and whether they are busy. You can filter the list of slots like this using the -constraint option and a ClassAd expression. For example, suppose we want to list all slots that are running CentOS 7 (operating system) and have at least 16 GB memory available. Note that memory is reported in units of megabytes. The command is: username@ap1 $ condor_status -constraint 'OpSysAndVer == \"CentOS7\" && Memory >= 16000' Note Be very careful with using quote characters appropriately in these commands. In the example above, the single quotes ( ' ) are for the shell, so that the entire expression is passed to condor_status untouched, and the double quotes ( \" ) surround a string value within the expression itself. Currently on PATh, there are only a few slots that meet these criteria (our high-memory servers, mainly used for metagenomics assemblies). If you are interested in learning more about writing ClassAd expressions, look at section 4.1 and especially 4.1.4 of the HTCondor Manual. This is definitely advanced material, so if you do not want to read it, that is fine. But if you do, take some time to practice writing expressions for the condor_status -constraint command. Note The condor_q command accepts the -constraint option as well! As you might expect, the option allows you to limit the jobs that are listed based on a ClassAd expression. Bonus: Formatting Output \u00b6 The condor_status command accepts the same -autoformat ( -af ) options that condor_q accepts, and the options have the same meanings in both commands. Of course, the attributes available in machine ads may differ from the ones that are available in job ads. Use the HTCondor Manual or look at individual slot ClassAds to get a better idea of what attributes are available. For example, I was curious about the host name and operating system of the slots with more than 32GB of memory: username@ap1 $ condor_status -af Machine -af OpSysAndVer -constraint 'Memory >= 32000' If you like, spend a few minutes now or later experimenting with condor_status formatting. References \u00b6 As suggested above, if you want to learn more about condor_status , you can do some reading: Read the condor_status man page or HTCondor Manual section (same text) to learn about more options Read about ClassAd attributes in the appendix of the HTCondor Manual Read about ClassAd expressions in section 4.1.4 of the HTCondor Manual","title":"Bonus Exercise 1.9 - Explore condor_status"},{"location":"materials/htcondor/part1-ex9-status/#bonus-htc-exercise-19-explore-condor_status","text":"The goal of this exercise is to try out some of the most common options to the condor_status command, so that you can view slots effectively.
The main part of this exercise should take just a few minutes, but if you have more time later, come back and work on the extension ideas at the end to become a condor_status expert!","title":"Bonus HTC Exercise 1.9: Explore condor_status"},{"location":"materials/htcondor/part1-ex9-status/#selecting-slots","text":"The condor_status program has many options for selecting which slots are listed. You've already learned the basic condor_status and the condor_status -compact variation (which you may wish to retry now, before proceeding). Another convenient option is to list only those slots that are available now: username@ap1 $ condor_status -avail Of course, the individual execute machines only report their slots to the collector at certain time intervals, so this list will not reflect the up-to-the-second reality of all slots. But this limitation is true of all condor_status output, not just with the -avail option. Similar to condor_q , you can limit the slots that are listed in two easy ways. To list just the slots on a specific machine: username@ap1 $ condor_status For example, if you want to see the slots on e2337.chtc.wisc.edu (in the CHTC pool): username@ap1 $ condor_status e2337.chtc.wisc.edu To list a specific slot on a machine: username@ap1 $ condor_status @ For example, to see the \u201cfirst\u201d slot on the machine above: username@ap1 $ condor_status slot1@e2337.chtc.wisc.edu Note You can name more than one hostname, slot, or combination thereof on the command line, in which case slots for all of the named hostnames and/or slots are listed. Let\u2019s get some practice using condor_status selections! List all slots in the pool \u2014 how many are there total? Practice using all forms of condor_status that you have learned: List the available slots. List the slots on a specific machine (e.g., e2337.chtc.wisc.edu ). List a specific slot from that machine. Try listing the slots from a few (but not all) machines at once. Try using a mix of hostnames and slot IDs at once.","title":"Selecting Slots"},{"location":"materials/htcondor/part1-ex9-status/#viewing-a-slot-classad","text":"Just as with condor_q , you can use condor_status to view the complete ClassAd for a given slot (often confusingly called the \u201cmachine\u201d ad): username@ap1 $ condor_status -long @ Because slot ClassAds may have 150\u2013200 attributes (or more), it probably makes the most sense to show the ClassAd for a single slot at a time, as shown above. Here are some examples of common, interesting attributes taken directly from condor_status output: OpSys = \"LINUX\" DetectedCpus = 24 OpSysAndVer = \"SL6\" MyType = \"Machine\" LoadAvg = 0.99 TotalDisk = 798098404 OSIssue = \"Scientific Linux release 6.6 (Carbon)\" TotalMemory = 24016 Machine = \"e242.chtc.wisc.edu\" CondorVersion = \"$CondorVersion: 8.5.5 May 03 2016 BuildID: 366162 $\" Memory = 1024 As you may be able to tell, there is a mix of attributes about the machine as a whole (hence the name \u201cmachine ad\u201d) and about the slot in particular. Go ahead and examine a machine ClassAd now.","title":"Viewing a Slot ClassAd"},{"location":"materials/htcondor/part1-ex9-status/#viewing-slots-by-classad-expression","text":"Often, it is helpful to view slots that meet some particular criteria. For example, if you know that your job needs a lot of memory to run, you may want to see how many high-memory slots there are and whether they are busy. You can filter the list of slots like this using the -constraint option and a ClassAd expression. 
For example, suppose we want to list all slots that are running Scientific Linux 7 (operating system) and have at least 16 GB memory available. Note that memory is reported in units of Megabytes. The command is: username@ap1 $ condor_status -constraint 'OpSysAndVer == \"CentOS7\" && Memory >= 16000' Note Be very careful with using quote characters appropriately in these commands. In the example above, the single quotes ( ' ) are for the shell, so that the entire expression is passed to condor_status untouched, and the double quotes ( \" ) surround a string value within the expression itself. Currently on PATh, there are only a few slots that meet these criteria (our high-memory servers, mainly used for metagenomics assemblies). If you are interested in learning more about writing ClassAd expressions, look at section 4.1 and especially 4.1.4 of the HTCondor Manual. This is definitely advanced material, so if you do not want to read it, that is fine. But if you do, take some time to practice writing expressions for the condor_status -constraint command. Note The condor_q command accepts the -constraint option as well! As you might expect, the option allows you to limit the jobs that are listed based on a ClassAd expression.","title":"Viewing Slots by ClassAd Expression"},{"location":"materials/htcondor/part1-ex9-status/#bonus-formatting-output","text":"The condor_status command accepts the same -autoformat ( -af ) options that condor_q accepts, and the options have the same meanings in both commands. Of course, the attributes available in machine ads may differ from the ones that are available in job ads. Use the HTCondor Manual or look at individual slot ClassAds to get a better idea of what attributes are available. For example, I was curious about the host name and operating system of the slots with more than 32GB of memory: username@ap1 $ condor_status -af Machine -af OpSysAndVer -constraint 'Memory >= 32000' If you like, spend a few minutes now or later experimenting with condor_status formatting.","title":"Bonus: Formatting Output"},{"location":"materials/htcondor/part1-ex9-status/#references","text":"As suggested above, if you want to learn more about condor_q , you can do some reading: Read the condor_status man page or HTCondor Manual section (same text) to learn about more options Read about ClassAd attributes in the appendix of the HTCondor Manual Read about ClassAd expressions in section 4.1.4 of the HTCondor Manual","title":"References"},{"location":"materials/htcondor/part2-ex1-files/","text":"pre em { font-style: normal; background-color: yellow; } pre strong { font-style: normal; font-weight: bold; color: \\#008; } HTC Exercise 2.1: Work With Input and Output Files \u00b6 Exercise Goal \u00b6 The goal of this exercise is make input files available to your job on the execute machine and to return output files back created in your job back to you on the access point. This small change significantly adds to the kinds of jobs that you can run. Viewing a Job Sandbox \u00b6 Before you learn to transfer files to and from your job, it is good to understand a bit more about the environment in which your job runs. When the HTCondor starter process prepares to run your job, it creates a new directory for your job and all of its files. We call this directory the job sandbox , because it is your job\u2019s private space to play. Let\u2019s see what is in the job sandbox for a minimal job with no special input or output files. 
Save the script below in a file named sandbox.sh : #!/bin/sh echo 'Date: ' ` date ` echo 'Host: ' ` hostname ` echo 'Sandbox: ' ` pwd ` ls -alF # END Create a submit file for this script and submit it. When the job finishes, look at the contents of the output file. In the output file, note the Sandbox: line: That is the full path to your job sandbox for the run. It was created just for your job, and it was removed as soon as your job finished. Next, look at the output that appears after the Sandbox: line; it is the output from the ls command in the script. It shows all of the files in your job sandbox, as they existed at the end of the execution of sandbox.sh . The number of files that you see can change depending on the HTC system you are using, but some of the files you should always see are: .chirp.config Configuration for an advanced feature sandbox.sh Your executable .job.ad The job ClassAd .machine.ad The machine ClassAd _condor_stderr Saved standard error from the job _condor_stdout Saved standard output from the job tmp/ , var/tmp/ Directories in which to put temporary files So, HTCondor wrote copies of the job and machine ads (for use by the job, if desired), transferred your executable ( sandbox.sh ), ran it, and saved its standard output and standard error into files. Notice that your submit file, which was in the same directory on the access point machine as your executable, was not transferred, nor were any other files that happened to be in directory with the submit file. Now that we know something about the sandbox, we can transfer more files to and from it. Running a Job With Input Files \u00b6 Next, you will run a job that requires an input file. Remember, the initial job sandbox will contain only the job executable, unless you tell HTCondor explicitly about every other file that needs to be transferred to the job. Here is a Python script that takes the name of an input file (containing one word per line) from the command line, counts the number of times each (lowercased) word occurs in the text, and prints out the final list of words and their counts. #!/usr/bin/env python3 import os import sys if len ( sys . argv ) != 2 : print ( f 'Usage: { os . path . basename ( sys . argv [ 0 ]) } DATA' ) sys . exit ( 1 ) input_filename = sys . argv [ 1 ] words = {} with open ( input_filename , 'r' , encoding = 'iso-8859-1' ) as my_file : for line in my_file : word = line . strip () . lower () if word in words : words [ word ] += 1 else : words [ word ] = 1 for word in sorted ( words . keys ()): print ( f ' { words [ word ] : 8d } { word } ' ) Create and save the Python script in a file named freq.py . Download the input file for the script (263K lines, ~1.4 MB) and save it in your submit directory: username@ap1 $ wget http://proxy.chtc.wisc.edu/SQUID/osgschool20/intro-2.1-words.txt Create a submit file for the freq.py executable. Add a line called transfer_input_files = to tell HTCondor to transfer the input file to the job: transfer_input_files = intro-2.1-words.txt As with all submit file commands, it does not matter where this line goes, as long as it comes before the word queue . Since we want HTCondor to pass an argument to our Python executable, we need to remember to add an arguments = line in our submit file so that HTCondor knows to pass an argument to the job. Set this arguments = line equal to the argument to the Python script (i.e., the name the input file). Submit the job to HTCondor, wait for it to finish, and check the output! 
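For reference, a submit file along the lines just described might look like the following sketch (the output, error, and log filenames and the request values are placeholders for you to tune, as in Exercise 1.5; the executable, argument, and transfer_input_files entries come from the steps above):

executable = freq.py
arguments = intro-2.1-words.txt
transfer_input_files = intro-2.1-words.txt
output = freq.out
error = freq.err
log = freq.log
request_cpus = 1
request_memory = 512MB
request_disk = 20MB
queue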
If things do not work the first time, keep trying! At this point in the exercises, we are telling you less and less explicitly how to do steps that you have done before. If you get stuck, ask for help in the Slack channel. Note If you want to transfer more than one input file, list all of them on a single transfer_input_files command, separated by commas. For example, if there are three input files: transfer_input_files = a.txt, b.txt, c.txt Transferring Output Files \u00b6 So far, we have relied on programs that send their output to the standard output and error streams, which HTCondor captures, saves, and returns back to the submit directory. But what if your program writes one or more files for its output? How do you tell HTCondor to bring them back? Let\u2019s start by exploring what happens to files that a job creates in the sandbox. We will use a very simple method for creating a new file: we will copy an input file to another name. Find or create a small input file (it is fine to use any small file from a previous exercise). Create a submit file that transfers the input file and copies it to another name (as if doing /bin/cp input.txt output.txt on the command line) Make the output filename different than any filenames that are in your submit directory What is the executable line? What is the arguments line? How do you tell HTCondor to transfer the input file? As always, use output , error , and log filenames that are different from previous exercises Submit the job and wait for it to finish. What happened? Can you tell what HTCondor did with the output file that was created (did it end up back on the access point?), after it was created in the job sandbox? Look carefully at the list of files in your submit directory now. Transferring Specific Output Files \u00b6 As you saw in the last exercise, by default HTCondor transfers files that are created in the job sandbox back to the submit directory when the job finishes. In fact, HTCondor will also transfer back changed input files, too. But, this only works for files that are in the top-level sandbox directory, and not for ones contained in subdirectories. What if you want to bring back only some output files, or output files contained in subdirectories? Here is a shell script that creates several files, including a copy of an input file in a new subdirectory: #!/bin/sh if [ $# -ne 1 ] ; then echo \"Usage: $0 INPUT\" ; exit 1 ; fi date > output-timestamp.txt cal > output-calendar.txt mkdir subdirectory cp $1 subdirectory/backup- $1 First, let\u2019s confirm that HTCondor does not bring back the output file (which starts with the prefix backup- ) in the subdirectory: Create a file called output.sh and save the above shell script in this file. Write a submit file that transfers any input file and runs output.sh on it (remember to include an arguments = line and pass the input filename as an argument). Submit the job, wait for it to finish, and examine the contents of your submit directory. Suppose you decide that you want only the timestamp output file and all files in the subdirectory, but not the calendar output file. You can tell HTCondor to only transfer these specific files back to the submission directory using transfer_output_files = : transfer_output_files = output-timestamp.txt, subdirectory/ When using transfer_output_files = , HTCondor will only transfer back the files you name - all other files will be ignored and deleted at the end of a job. Note See the trailing slash ( / ) on the subdirectory? 
That tells HTCondor to transfer back the files contained in the subdirectory, but not the directory itself ; the files will be written directly into the submit directory. If you want HTCondor to transfer back an entire directory, leave off the trailing slash. Remove all output files from the previous run, including output-timestamp.txt and output-calendar.txt . Copy the previous submit file that ran output.sh and add the transfer_output_files line from above. Submit the job, wait for it to finish, and examine the contents of your submit directory. Did it work as you expected? Thinking About Progress So Far \u00b6 At this point, you can do just about everything that you need in order to run jobs on a HTC pool. You can identify the executable, arguments, and input files, and you can get output back from the job. This is a big achievement! References \u00b6 There are many more details about HTCondor\u2019s file transfer mechanism not covered here. For more information, read \"Submitting Jobs Without a Shared Filesystem\" in the HTCondor Manual.","title":"2.1 - Work with input and output files"},{"location":"materials/htcondor/part2-ex1-files/#htc-exercise-21-work-with-input-and-output-files","text":"","title":"HTC Exercise 2.1: Work With Input and Output Files"},{"location":"materials/htcondor/part2-ex1-files/#exercise-goal","text":"The goal of this exercise is make input files available to your job on the execute machine and to return output files back created in your job back to you on the access point. This small change significantly adds to the kinds of jobs that you can run.","title":"Exercise Goal"},{"location":"materials/htcondor/part2-ex1-files/#viewing-a-job-sandbox","text":"Before you learn to transfer files to and from your job, it is good to understand a bit more about the environment in which your job runs. When the HTCondor starter process prepares to run your job, it creates a new directory for your job and all of its files. We call this directory the job sandbox , because it is your job\u2019s private space to play. Let\u2019s see what is in the job sandbox for a minimal job with no special input or output files. Save the script below in a file named sandbox.sh : #!/bin/sh echo 'Date: ' ` date ` echo 'Host: ' ` hostname ` echo 'Sandbox: ' ` pwd ` ls -alF # END Create a submit file for this script and submit it. When the job finishes, look at the contents of the output file. In the output file, note the Sandbox: line: That is the full path to your job sandbox for the run. It was created just for your job, and it was removed as soon as your job finished. Next, look at the output that appears after the Sandbox: line; it is the output from the ls command in the script. It shows all of the files in your job sandbox, as they existed at the end of the execution of sandbox.sh . The number of files that you see can change depending on the HTC system you are using, but some of the files you should always see are: .chirp.config Configuration for an advanced feature sandbox.sh Your executable .job.ad The job ClassAd .machine.ad The machine ClassAd _condor_stderr Saved standard error from the job _condor_stdout Saved standard output from the job tmp/ , var/tmp/ Directories in which to put temporary files So, HTCondor wrote copies of the job and machine ads (for use by the job, if desired), transferred your executable ( sandbox.sh ), ran it, and saved its standard output and standard error into files. 
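For comparison, a bare-bones submit file for this sandbox test could look something like the following sketch (the output, error, log, and request_* values shown are only example choices): executable = sandbox.sh output = sandbox.out error = sandbox.err log = sandbox.log request_cpus = 1 request_memory = 1GB request_disk = 1GB queue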
Notice that your submit file, which was in the same directory on the access point machine as your executable, was not transferred, nor were any other files that happened to be in directory with the submit file. Now that we know something about the sandbox, we can transfer more files to and from it.","title":"Viewing a Job Sandbox"},{"location":"materials/htcondor/part2-ex1-files/#running-a-job-with-input-files","text":"Next, you will run a job that requires an input file. Remember, the initial job sandbox will contain only the job executable, unless you tell HTCondor explicitly about every other file that needs to be transferred to the job. Here is a Python script that takes the name of an input file (containing one word per line) from the command line, counts the number of times each (lowercased) word occurs in the text, and prints out the final list of words and their counts. #!/usr/bin/env python3 import os import sys if len ( sys . argv ) != 2 : print ( f 'Usage: { os . path . basename ( sys . argv [ 0 ]) } DATA' ) sys . exit ( 1 ) input_filename = sys . argv [ 1 ] words = {} with open ( input_filename , 'r' , encoding = 'iso-8859-1' ) as my_file : for line in my_file : word = line . strip () . lower () if word in words : words [ word ] += 1 else : words [ word ] = 1 for word in sorted ( words . keys ()): print ( f ' { words [ word ] : 8d } { word } ' ) Create and save the Python script in a file named freq.py . Download the input file for the script (263K lines, ~1.4 MB) and save it in your submit directory: username@ap1 $ wget http://proxy.chtc.wisc.edu/SQUID/osgschool20/intro-2.1-words.txt Create a submit file for the freq.py executable. Add a line called transfer_input_files = to tell HTCondor to transfer the input file to the job: transfer_input_files = intro-2.1-words.txt As with all submit file commands, it does not matter where this line goes, as long as it comes before the word queue . Since we want HTCondor to pass an argument to our Python executable, we need to remember to add an arguments = line in our submit file so that HTCondor knows to pass an argument to the job. Set this arguments = line equal to the argument to the Python script (i.e., the name the input file). Submit the job to HTCondor, wait for it to finish, and check the output! If things do not work the first time, keep trying! At this point in the exercises, we are telling you less and less explicitly how to do steps that you have done before. If you get stuck, ask for help in the Slack channel. Note If you want to transfer more than one input file, list all of them on a single transfer_input_files command, separated by commas. For example, if there are three input files: transfer_input_files = a.txt, b.txt, c.txt","title":"Running a Job With Input Files"},{"location":"materials/htcondor/part2-ex1-files/#transferring-output-files","text":"So far, we have relied on programs that send their output to the standard output and error streams, which HTCondor captures, saves, and returns back to the submit directory. But what if your program writes one or more files for its output? How do you tell HTCondor to bring them back? Let\u2019s start by exploring what happens to files that a job creates in the sandbox. We will use a very simple method for creating a new file: we will copy an input file to another name. Find or create a small input file (it is fine to use any small file from a previous exercise). 
Create a submit file that transfers the input file and copies it to another name (as if doing /bin/cp input.txt output.txt on the command line) Make the output filename different than any filenames that are in your submit directory What is the executable line? What is the arguments line? How do you tell HTCondor to transfer the input file? As always, use output , error , and log filenames that are different from previous exercises Submit the job and wait for it to finish. What happened? Can you tell what HTCondor did with the output file that was created (did it end up back on the access point?), after it was created in the job sandbox? Look carefully at the list of files in your submit directory now.","title":"Transferring Output Files"},{"location":"materials/htcondor/part2-ex1-files/#transferring-specific-output-files","text":"As you saw in the last exercise, by default HTCondor transfers files that are created in the job sandbox back to the submit directory when the job finishes. In fact, HTCondor will also transfer back changed input files, too. But, this only works for files that are in the top-level sandbox directory, and not for ones contained in subdirectories. What if you want to bring back only some output files, or output files contained in subdirectories? Here is a shell script that creates several files, including a copy of an input file in a new subdirectory: #!/bin/sh if [ $# -ne 1 ] ; then echo \"Usage: $0 INPUT\" ; exit 1 ; fi date > output-timestamp.txt cal > output-calendar.txt mkdir subdirectory cp $1 subdirectory/backup- $1 First, let\u2019s confirm that HTCondor does not bring back the output file (which starts with the prefix backup- ) in the subdirectory: Create a file called output.sh and save the above shell script in this file. Write a submit file that transfers any input file and runs output.sh on it (remember to include an arguments = line and pass the input filename as an argument). Submit the job, wait for it to finish, and examine the contents of your submit directory. Suppose you decide that you want only the timestamp output file and all files in the subdirectory, but not the calendar output file. You can tell HTCondor to only transfer these specific files back to the submission directory using transfer_output_files = : transfer_output_files = output-timestamp.txt, subdirectory/ When using transfer_output_files = , HTCondor will only transfer back the files you name - all other files will be ignored and deleted at the end of a job. Note See the trailing slash ( / ) on the subdirectory? That tells HTCondor to transfer back the files contained in the subdirectory, but not the directory itself ; the files will be written directly into the submit directory. If you want HTCondor to transfer back an entire directory, leave off the trailing slash. Remove all output files from the previous run, including output-timestamp.txt and output-calendar.txt . Copy the previous submit file that ran output.sh and add the transfer_output_files line from above. Submit the job, wait for it to finish, and examine the contents of your submit directory. Did it work as you expected?","title":"Transferring Specific Output Files"},{"location":"materials/htcondor/part2-ex1-files/#thinking-about-progress-so-far","text":"At this point, you can do just about everything that you need in order to run jobs on a HTC pool. You can identify the executable, arguments, and input files, and you can get output back from the job. 
This is a big achievement!","title":"Thinking About Progress So Far"},{"location":"materials/htcondor/part2-ex1-files/#references","text":"There are many more details about HTCondor\u2019s file transfer mechanism not covered here. For more information, read \"Submitting Jobs Without a Shared Filesystem\" in the HTCondor Manual.","title":"References"},{"location":"materials/htcondor/part2-ex2-queue-n/","text":"pre em { font-style: normal; background-color: yellow; } pre strong { font-style: normal; font-weight: bold; color: \#008; } HTC Exercise 2.2: Use queue N , $(Cluster), and $(Process) \u00b6 Background \u00b6 Suppose you have a program that you want to run many times with different arguments each time. With what you know so far, you have a couple of choices: Write one submit file; submit one job, change the argument in the submit file, submit another job, change the submit file, \u2026 Write many submit files that are nearly identical except for the program argument Neither of these options seems very satisfying. Fortunately, HTCondor's queue statement is here to help! Exercise Goal \u00b6 The goal of the next several exercises is to learn to submit many jobs from a single HTCondor queue statement, and to control things like filenames and arguments on a per-job basis when doing so. Running Many Jobs With One queue Statement \u00b6 Example Here is a C program that uses a stochastic (random) method to estimate the value of \u03c0. The single argument to the program is the number of samples to take. More samples should result in better estimates! #include <stdio.h> #include <stdlib.h> #include <sys/time.h> int main ( int argc , char * argv []) { struct timeval my_timeval ; int iterations = 0 ; int inside_circle = 0 ; int i ; double x , y , pi_estimate ; gettimeofday ( & my_timeval , NULL ); srand48 ( my_timeval . tv_sec ^ my_timeval . tv_usec ); if ( argc == 2 ) { iterations = atoi ( argv [ 1 ]); } else { printf ( \"usage: circlepi ITERATIONS \\n \" ); exit ( 1 ); } for ( i = 0 ; i < iterations ; i ++ ) { x = ( drand48 () - 0.5 ) * 2.0 ; y = ( drand48 () - 0.5 ) * 2.0 ; if ((( x * x ) + ( y * y )) <= 1.0 ) { inside_circle ++ ; } } pi_estimate = 4.0 * (( double ) inside_circle / ( double ) iterations ); printf ( \"%d iterations, %d inside; pi = %f \\n \" , iterations , inside_circle , pi_estimate ); return 0 ; } In a new directory for this exercise, create and save the code to a file named circlepi.c Compile the code (we will cover this in more detail during the Software lecture): username@ap1 $ gcc -o circlepi circlepi.c Test the program with just 1000 samples: username@ap1 $ ./circlepi 1000 Now suppose that you want to run the program many times, to produce many estimates. To do so, we can tell HTCondor how many jobs to \"queue up\" via the queue statement we've been putting at the end of each of our submit files. Let\u2019s see how it works: Write a normal submit file for this program Pass 1 million ( 1000000 ) as the command line argument to circlepi Make sure to include log , output , and error (with filenames like circlepi.log ), and request_* lines At the end of the file, write queue 3 instead of just queue (\"queue 3 jobs\" vs. \"queue a job\"). Submit the file. Note the slightly different message from condor_submit : 3 job(s) submitted to cluster *NNNN*.
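For reference, the submit file for this step might look roughly like the following sketch (filenames and request_* values are example choices; the important parts are the 1000000 argument and the final queue 3 statement): executable = circlepi arguments = 1000000 output = circlepi.out error = circlepi.err log = circlepi.log request_cpus = 1 request_memory = 1GB request_disk = 1GB queue 3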
Before the jobs execute, look at the job queue to see the multiple jobs Here is some sample condor_q -nobatch output: ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD 10228.0 cat 7/25 11:57 0+00:00:00 I 0 0.7 circlepi 1000000000 10228.1 cat 7/25 11:57 0+00:00:00 I 0 0.7 circlepi 1000000000 10228.2 cat 7/25 11:57 0+00:00:00 I 0 0.7 circlepi 1000000000 In this sample, all three jobs are part of cluster 10228 , but the first job was assigned process 0 , the second job was assigned process 1 , and the third one was assigned process 2 . (Programmers like to start counting from 0.) Now we can understand what the first column in the output, the job ID , represents. It is a job\u2019s cluster number , a dot ( . ), and the job\u2019s process number . So in the example above, the job ID of the second job is 10228.1 . Pop Quiz: Do you remember how to ask HTCondor's queue to list the status of all of the jobs from one cluster? How about one specific job ID? Using queue N With Output \u00b6 When all three jobs in your single cluster are finished, examine the resulting files. What is in the output file? What is in the error file? (hopefully it is empty!) What is in the log file? Look carefully at the job IDs in each event. Is this what you expected? Is it what you wanted? If the output is not what you expected, what do you think happened? Using $(Process) to Distinguish Jobs \u00b6 As you saw with the experiment above, each job ended up overwriting the same output and error filenames in the submission directory. After all, we didn't tell it to behave any differently when it ran three jobs. We need a way to separate output (and error) files per job that is queued , not just for the whole cluster of jobs. Fortunately, HTCondor has a way to separate the files easily. When processing a submit file, HTCondor will replace any instance of $(Process) with the process number of the job, for each job that is queued. For example, you can use the $(Process) variable to define a separate output file name for each job: output = my-output-file-$(Process).out queue 10 Even though the output filename is defined only once, HTCondor will create separate output filenames for each job: First job my-output-file-0.out Second job my-output-file-1.out Third job my-output-file-2.out ... ... Last (tenth) job my-output-file-9.out Let\u2019s see how this works for our program that estimates \u03c0. In your submit file, change the definitions of output and error to use $(Process) in the filename, similar to the example above. Delete any standard output, standard error, and log files from previous runs. Submit the updated file. When all three jobs are finished, examine the resulting files again. How many files are there of each type? What are their names? Is this what you expected? Is it what you wanted from the \u03c0 estimation process? Using $(Cluster) to Separate Files Across Runs \u00b6 With $(Process) , you can get separate output (and error) filenames for each job within a run. However, the next time you submit the same file, all of the output and error files are overwritten by new ones created by the new jobs. Maybe this is the behavior that you want. But sometimes, you may want to separate files by run, as well. In addition to $(Process) , there is also a $(Cluster) variable that you can use in your submit files. It works just like $(Process) , except it is replaced with the cluster number of the entire submission. 
Because the cluster number is the same for all jobs within a single submission, it does not separate files by job within a submission. But when combined with $(Process) , it can be used to separate files by run. For example, consider this output statement: output = my-output-file-$(Cluster)-$(Process).out For one particular run, it might result in output filenames like my-output-file-2444-0.out , my-output-file-2444-1.out , my-output-file-2444-2.out , etc. However, the next run would have different filenames, replacing 2444 with the new Cluster number of that run. Using $(Process) and $(Cluster) in Other Statements \u00b6 The $(Cluster) and $(Process) variables can be used in any submit file statement, although they are useful in some kinds of submit file statements and not really for others. For example, consider using $(Cluster) or $(Process) in each of the below: log transfer_input_files transfer_output_files arguments Unfortunately, HTCondor does not easily let you perform math on the $(Process) number when using it. So, for example, if you use $(Process) as a numeric argument to a command, it will always result in jobs getting the arguments 0, 1, 2, and so on. If you have control over your program and the way in which it uses command-line arguments, then you are fine. Otherwise, you might need a solution like those in the next exercises. (Optional) Defining JobBatchName for Tracking \u00b6 It is possible to define arbitrary attributes in your submit file, and one purpose of such attributes is to track or report on different jobs separately. In this optional exercise, you will see how this technique can be used. Once again, we will use sleep jobs, so that your jobs remain in the queue long enough to experiment on. Create a submit file that runs sleep 120 . Instead of a single queue statement, write this: jobbatchname = 1 queue 5 Submit the submit file to HTCondor. Now, quickly edit the submit file to instead say: jobbatchname = 2 Submit the file again. Check on the submissions using a normal condor_q and condor_q -nobatch . Of course, your special attribute does not appear in the condor_q -nobatch output, but it is present in the condor_q output and in each job\u2019s ClassAd. You can see the effect of the attribute by limiting your condor_q output to one type of job or another. First, run this command: username@ap1 $ condor_q -constraint 'JobBatchName == \"1\"' Do you get the output that you expected? Using the example command above, how would you list your other five jobs? (There will be more on how to use HTCondor constraints in later exercises.)","title":"2.2 - Use queue N, $(Cluster), and $(Process)"},{"location":"materials/htcondor/part2-ex2-queue-n/#htc-exercise-22-use-queue-n-cluster-and-process","text":"","title":"HTC Exercise 2.2: Use queue N, $(Cluster), and $(Process)"},{"location":"materials/htcondor/part2-ex2-queue-n/#background","text":"Suppose you have a program that you want to run many times with different arguments each time. With what you know so far, you have a couple of choices: Write one submit file; submit one job, change the argument in the submit file, submit another job, change the submit file, \u2026 Write many submit files that are nearly identical except for the program argument Neither of these options seems very satisfying.
Fortunately, HTCondor's queue statement is here to help!","title":"Background"},{"location":"materials/htcondor/part2-ex2-queue-n/#exercise-goal","text":"The goal of the next several exercises is to learn to submit many jobs from a single HTCondor queue statement, and to control things like filenames and arguments on a per-job basis when doing so.","title":"Exercise Goal"},{"location":"materials/htcondor/part2-ex2-queue-n/#running-many-jobs-with-one-queue-statement","text":"Example Here is a C program that uses a stochastic (random) method to estimate the value of \u03c0. The single argument to the program is the number of samples to take. More samples should result in better estimates! #include #include #include int main ( int argc , char * argv []) { struct timeval my_timeval ; int iterations = 0 ; int inside_circle = 0 ; int i ; double x , y , pi_estimate ; gettimeofday ( & my_timeval , NULL ); srand48 ( my_timeval . tv_sec ^ my_timeval . tv_usec ); if ( argc == 2 ) { iterations = atoi ( argv [ 1 ]); } else { printf ( \"usage: circlepi ITERATIONS \\n \" ); exit ( 1 ); } for ( i = 0 ; i < iterations ; i ++ ) { x = ( drand48 () - 0.5 ) * 2.0 ; y = ( drand48 () - 0.5 ) * 2.0 ; if ((( x * x ) + ( y * y )) <= 1.0 ) { inside_circle ++ ; } } pi_estimate = 4.0 * (( double ) inside_circle / ( double ) iterations ); printf ( \"%d iterations, %d inside; pi = %f \\n \" , iterations , inside_circle , pi_estimate ); return 0 ; } In a new directory for this exercise, create and save the code to a file named circlepi.c Compile the code (we will cover this in more detail during the Software lecture): username@ap1 $ gcc -o circlepi circlepi.c Test the program with just 1000 samples: username@ap1 $ ./circlepi 1000 Now suppose that you want to run the program many times, to produce many estimates. To do so, we can tell HTCondor how many jobs to \"queue up\" via the queue statement we've been putting at the end of each of our submit files. Let\u2019s see how it works: Write a normal submit file for this program Pass 1 million ( 1000000 ) as the command line argument to circlepi Make sure to include log , output , and error (with filenames like circlepi.log ), and request_* lines At the end of the file, write queue 3 instead of just queue (\"queue 3 jobs\" vs. \"queue a job\"). Submit the file. Note the slightly different message from condor_submit : 3 job(s) submitted to cluster *NNNN*. Before the jobs execute, look at the job queue to see the multiple jobs Here is some sample condor_q -nobatch output: ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD 10228.0 cat 7/25 11:57 0+00:00:00 I 0 0.7 circlepi 1000000000 10228.1 cat 7/25 11:57 0+00:00:00 I 0 0.7 circlepi 1000000000 10228.2 cat 7/25 11:57 0+00:00:00 I 0 0.7 circlepi 1000000000 In this sample, all three jobs are part of cluster 10228 , but the first job was assigned process 0 , the second job was assigned process 1 , and the third one was assigned process 2 . (Programmers like to start counting from 0.) Now we can understand what the first column in the output, the job ID , represents. It is a job\u2019s cluster number , a dot ( . ), and the job\u2019s process number . So in the example above, the job ID of the second job is 10228.1 . Pop Quiz: Do you remember how to ask HTCondor's queue to list the status of all of the jobs from one cluster? 
How about one specific job ID?","title":"Running Many Jobs With One queue Statement"},{"location":"materials/htcondor/part2-ex2-queue-n/#using-queue-n-with-output","text":"When all three jobs in your single cluster are finished, examine the resulting files. What is in the output file? What is in the error file? (hopefully it is empty!) What is in the log file? Look carefully at the job IDs in each event. Is this what you expected? Is it what you wanted? If the output is not what you expected, what do you think happened?","title":"Using queue N With Output"},{"location":"materials/htcondor/part2-ex2-queue-n/#using-process-to-distinguish-jobs","text":"As you saw with the experiment above, each job ended up overwriting the same output and error filenames in the submission directory. After all, we didn't tell it to behave any differently when it ran three jobs. We need a way to separate output (and error) files per job that is queued , not just for the whole cluster of jobs. Fortunately, HTCondor has a way to separate the files easily. When processing a submit file, HTCondor will replace any instance of $(Process) with the process number of the job, for each job that is queued. For example, you can use the $(Process) variable to define a separate output file name for each job: output = my-output-file-$(Process).out queue 10 Even though the output filename is defined only once, HTCondor will create separate output filenames for each job: First job my-output-file-0.out Second job my-output-file-1.out Third job my-output-file-2.out ... ... Last (tenth) job my-output-file-9.out Let\u2019s see how this works for our program that estimates \u03c0. In your submit file, change the definitions of output and error to use $(Process) in the filename, similar to the example above. Delete any standard output, standard error, and log files from previous runs. Submit the updated file. When all three jobs are finished, examine the resulting files again. How many files are there of each type? What are their names? Is this what you expected? Is it what you wanted from the \u03c0 estimation process?","title":"Using $(Process) to Distinguish Jobs"},{"location":"materials/htcondor/part2-ex2-queue-n/#using-cluster-to-separate-files-across-runs","text":"With $(Process) , you can get separate output (and error) filenames for each job within a run. However, the next time you submit the same file, all of the output and error files are overwritten by new ones created by the new jobs. Maybe this is the behavior that you want. But sometimes, you may want to separate files by run, as well. In addition to $(Process) , there is also a $(Cluster) variable that you can use in your submit files. It works just like $(Process) , except it is replaced with the cluster number of the entire submission. Because the cluster number is the same for all jobs within a single submission, it does not separate files by job within a submission. But when used with $(Process) , it can be used to separate files by run. For example, consider this output statement: output = my-output-file-$(Cluster)-$(Process).out For one particular run, it might result in output filenames like my-output-file-2444-0.out , myoutput-file-2444-1.out , myoutput-file-2444-2.out , etc. 
However, the next run would have different filenames, replacing 2444 with the new Cluster number of that run.","title":"Using $(Cluster) to Separate Files Across Runs"},{"location":"materials/htcondor/part2-ex2-queue-n/#using-process-and-cluster-in-other-statements","text":"The $(Cluster) and $(Process) variables can be used in any submit file statement, although they are useful in some kinds of submit file statements and not really for others. For example, consider using $(Cluster) or $(Process) in each of the below: log transfer_input_files transfer_output_files arguments Unfortunately, HTCondor does not easily let you perform math on the $(Process) number when using it. So, for example, if you use $(Process) as a numeric argument to a command, it will always result in jobs getting the arguments 0, 1, 2, and so on. If you have control over your program and the way in which it uses command-line arguments, then you are fine. Otherwise, you might need a solution like those in the next exercises.","title":"Using $(Process) and $(Cluster) in Other Statements"},{"location":"materials/htcondor/part2-ex2-queue-n/#optional-defining-jobbatchname-for-tracking","text":"It is possible to define arbitrary attributes in your submit file, and that one purpose of such attributes is to track or report on different jobs separately. In this optional exercise, you will see how this technique can be used. Once again, we will use sleep jobs, so that your jobs remain in the queue long enough to experiment on. Create a submit file that runs sleep 120 . Instead of a single queue statement, write this: jobbatchname = 1 queue 5 Submit the submit file to HTCondor. Now, quickly edit the submit file to instead say: jobbatchname = 2 Submit the file again. Check on the submissions using a normal condor_q and condor_q -nobatch . Of course, your special attribute does not appear in the condor_q -nobatch output, but it is present in the condor_q output and in each job\u2019s ClassAd. You can see the effect of the attribute by limiting your condor_q output to one type of job or another. First, run this command: username@ap1 $ condor_q -constraint 'JobBatchName == \"1\"' Do you get the output that you expected? Using the example command above, how would you list your other five jobs? (There will be more on how to use HTCondor constraints in later exercises.)","title":"(Optional) Defining JobBatchName for Tracking"},{"location":"materials/htcondor/part2-ex3-queue-from/","text":"pre em { font-style: normal; background-color: yellow; } pre strong { font-style: normal; font-weight: bold; color: \\#008; } HTC Exercise 2.3: Submit with \u201cqueue from\u201d \u00b6 Exercise Goals \u00b6 In this exercise and the next one, you will explore more ways to use a single submit file to submit many jobs . The goal of this exercise is to submit many jobs from a single submit file by using the queue ... from syntax to read variable values from a file. Background \u00b6 In all cases of submitting many jobs from a single submit file, the key questions are: What makes each job unique? In other words, there is one job per _____? So, how should you tell HTCondor to distinguish each job? For queue *N* , jobs are distinguished simply by the built-in \"process\" variable. But with the remaining queue forms, you help HTCondor distinguish jobs by other, more meaningful custom variables. Counting Words in Files \u00b6 Imagine you have a collection of books, and you want to analyze how word usage varies from book to book or author to author. 
As mentioned in the lecture, HTCondor provides many ways to submit jobs for this task. You could create a separate submit file for each book, and submit all of the files manually, but you'd have a lot of file lines to modify each time (in particular, all five of the last lines before queue below): executable = freq.py request_memory = 1GB request_disk = 20MB should_transfer_files = YES when_to_transfer_output = ON_EXIT transfer_input_files = AAiW.txt arguments = AAiW.txt output = AAiW.out error = AAiW.err log = AAiW.log queue This would be overly verbose and tedious. Let's do better. Queue Jobs From a List of Values \u00b6 Suppose we want to modify our word-frequency analysis from a previous exercise so that it outputs only the most common N words of a document. However, we want to experiment with different values of N . For this analysis, we will have a new version of the word-frequency counting script. First, we need a new version of the word counting program so that it accepts an extra number as a command line argument and outputs only that many of the most common words. Here is the new code (it's still not important that you understand this code): #!/usr/bin/env python3 import os import sys import operator if len ( sys . argv ) != 3 : print ( f 'Usage: { os . path . basename ( sys . argv [ 0 ]) } DATA NUM_WORDS' ) sys . exit ( 1 ) input_filename = sys . argv [ 1 ] num_words = int ( sys . argv [ 2 ]) words = {} with open ( input_filename , 'r' ) as my_file : for line in my_file : line_words = line . split () for word in line_words : if word in words : words [ word ] += 1 else : words [ word ] = 1 sorted_words = sorted ( words . items (), key = operator . itemgetter ( 1 )) for word in sorted_words [ - num_words :]: print ( f ' { word [ 0 ] } { word [ 1 ] : 8d } ' ) To submit this program with a collection of two variable values for each run, one for the number of top words and one for the filename: Save the script as wordcount-top-n.py . Download and unpack some books from Project Gutenberg: user@ap1 $ wget http://proxy.chtc.wisc.edu/SQUID/osgschool20/books.zip user@ap1 $ unzip books.zip Create a new submit file (or base it off a previous one!) named wordcount-top.sub , including memory and disk requests of 20 MB. All of the jobs will use the same executable and log statements. Update other statements to work with two variables, book and n : output = $(book)_top_$(n).out error = $(book)_top_$(n).err transfer_input_files = $(book) arguments = \"$(book) $(n)\" queue book, n from books_n.txt Note especially the changes to the queue statement; it now tells HTCondor to read a separate text file of pairs of values, which will be assigned to book and n respectively. Create the separate text file of job variable values and save it as books_n.txt : AAiW.txt, 10 AAiW.txt, 25 AAiW.txt, 50 PandP.txt, 10 PandP.txt, 25 PandP.txt, 50 TAoSH.txt, 10 TAoSH.txt, 25 TAoSH.txt, 50 Note that we used 3 different values for n for each book. Submit the file Do a quick sanity check: How many jobs were submitted? How many log, output, and error files were created? Extra Challenge 1 \u00b6 You may have noticed that the output of these jobs has a messy naming convention. Because our macros resolve to the filenames, including their extension (e.g., AAiW.txt ), the output filenames contain with multiple extensions (e.g., AAiW.txt.err ). Although the extra extension is acceptable, it makes the filenames harder to read and possibly organize. 
Change your submit file and variable file for this exercise so that the output filenames do not include the .txt extension.","title":"2.3 - Use queue from with custom variables"},{"location":"materials/htcondor/part2-ex3-queue-from/#htc-exercise-23-submit-with-queue-from","text":"","title":"HTC Exercise 2.3: Submit with \u201cqueue from\u201d"},{"location":"materials/htcondor/part2-ex3-queue-from/#exercise-goals","text":"In this exercise and the next one, you will explore more ways to use a single submit file to submit many jobs . The goal of this exercise is to submit many jobs from a single submit file by using the queue ... from syntax to read variable values from a file.","title":"Exercise Goals"},{"location":"materials/htcondor/part2-ex3-queue-from/#background","text":"In all cases of submitting many jobs from a single submit file, the key questions are: What makes each job unique? In other words, there is one job per _____? So, how should you tell HTCondor to distinguish each job? For queue *N* , jobs are distinguished simply by the built-in \"process\" variable. But with the remaining queue forms, you help HTCondor distinguish jobs by other, more meaningful custom variables.","title":"Background"},{"location":"materials/htcondor/part2-ex3-queue-from/#counting-words-in-files","text":"Imagine you have a collection of books, and you want to analyze how word usage varies from book to book or author to author. As mentioned in the lecture, HTCondor provides many ways to submit jobs for this task. You could create a separate submit file for each book, and submit all of the files manually, but you'd have a lot of file lines to modify each time (in particular, all five of the last lines before queue below): executable = freq.py request_memory = 1GB request_disk = 20MB should_transfer_files = YES when_to_transfer_output = ON_EXIT transfer_input_files = AAiW.txt arguments = AAiW.txt output = AAiW.out error = AAiW.err log = AAiW.log queue This would be overly verbose and tedious. Let's do better.","title":"Counting Words in Files"},{"location":"materials/htcondor/part2-ex3-queue-from/#queue-jobs-from-a-list-of-values","text":"Suppose we want to modify our word-frequency analysis from a previous exercise so that it outputs only the most common N words of a document. However, we want to experiment with different values of N . For this analysis, we will have a new version of the word-frequency counting script. First, we need a new version of the word counting program so that it accepts an extra number as a command line argument and outputs only that many of the most common words. Here is the new code (it's still not important that you understand this code): #!/usr/bin/env python3 import os import sys import operator if len ( sys . argv ) != 3 : print ( f 'Usage: { os . path . basename ( sys . argv [ 0 ]) } DATA NUM_WORDS' ) sys . exit ( 1 ) input_filename = sys . argv [ 1 ] num_words = int ( sys . argv [ 2 ]) words = {} with open ( input_filename , 'r' ) as my_file : for line in my_file : line_words = line . split () for word in line_words : if word in words : words [ word ] += 1 else : words [ word ] = 1 sorted_words = sorted ( words . items (), key = operator . itemgetter ( 1 )) for word in sorted_words [ - num_words :]: print ( f ' { word [ 0 ] } { word [ 1 ] : 8d } ' ) To submit this program with a collection of two variable values for each run, one for the number of top words and one for the filename: Save the script as wordcount-top-n.py . 
Download and unpack some books from Project Gutenberg: user@ap1 $ wget http://proxy.chtc.wisc.edu/SQUID/osgschool20/books.zip user@ap1 $ unzip books.zip Create a new submit file (or base it off a previous one!) named wordcount-top.sub , including memory and disk requests of 20 MB. All of the jobs will use the same executable and log statements. Update other statements to work with two variables, book and n : output = $(book)_top_$(n).out error = $(book)_top_$(n).err transfer_input_files = $(book) arguments = \"$(book) $(n)\" queue book, n from books_n.txt Note especially the changes to the queue statement; it now tells HTCondor to read a separate text file of pairs of values, which will be assigned to book and n respectively. Create the separate text file of job variable values and save it as books_n.txt : AAiW.txt, 10 AAiW.txt, 25 AAiW.txt, 50 PandP.txt, 10 PandP.txt, 25 PandP.txt, 50 TAoSH.txt, 10 TAoSH.txt, 25 TAoSH.txt, 50 Note that we used 3 different values for n for each book. Submit the file Do a quick sanity check: How many jobs were submitted? How many log, output, and error files were created?","title":"Queue Jobs From a List of Values"},{"location":"materials/htcondor/part2-ex3-queue-from/#extra-challenge-1","text":"You may have noticed that the output of these jobs has a messy naming convention. Because our macros resolve to the filenames, including their extension (e.g., AAiW.txt ), the output filenames contain with multiple extensions (e.g., AAiW.txt.err ). Although the extra extension is acceptable, it makes the filenames harder to read and possibly organize. Change your submit file and variable file for this exercise so that the output filenames do not include the .txt extension.","title":"Extra Challenge 1"},{"location":"materials/htcondor/part2-ex4-queue-matching/","text":"pre em { font-style: normal; background-color: yellow; } pre strong { font-style: normal; font-weight: bold; color: \\#008; } Bonus HTC Exercise 2.4: Submit With \u201cqueue matching\u201d \u00b6 Exercise Goal \u00b6 The goal of this exercise is to submit many jobs from a single submit file by using the queue ... matching syntax to submit jobs with variable values derived from files in the current directory which match a specified pattern. Counting Words in Files \u00b6 Returning to our book word-counting example, let's pretend that instead of three books, we have an entire library. While we could list all of the text files in a books.txt file and use queue book from books.txt , it could be a tedious process, especially for tens of thousands of files. Luckily HTCondor provides a mechanism for submitting jobs based on pattern-matched files. Queue Jobs By Matching Filenames \u00b6 This is an example of a common scenario: We want to run one job per file, where the filenames match a certain consistent pattern. The queue ... matching statement is made for this scenario. Let\u2019s see this in action. First, here is a new version of the script (note, we removed the 'top n words' restriction): #!/usr/bin/env python3 import os import sys import operator if len ( sys . argv ) != 2 : print ( f 'Usage: { os . path . basename ( sys . argv [ 0 ]) } DATA' ) sys . exit ( 1 ) input_filename = sys . argv [ 1 ] words = {} with open ( input_filename , 'r' ) as my_file : for line in my_file : line_words = line . split () for word in line_words : if word in words : words [ word ] += 1 else : words [ word ] = 1 sorted_words = sorted ( words . items (), key = operator . 
itemgetter ( 1 )) for word in sorted_words : print ( f ' { word [ 0 ] } { word [ 1 ] : 8d } ' ) To use the script: Create and save this script as wordcount.py . Verify the script by running it on one book manually. Create a new submit file to submit one job (pick a book file and model your submit file off of the one above) Modify the following submit file statements to work for all books: transfer_input_files = $(book) arguments = $(book) output = $(book).out error = $(book).err queue book matching *.txt Note As always, the order of statements in a submit file does not matter, except that the queue statement should be last. Also note that any submit file variable name (here, book , but true for process and all others) may be used in any mixture of upper- and lowercase letters. Submit the jobs. HTCondor uses the queue ... matching statement to look for files in the submit directory that match the given pattern, then queues one job per match. For each job, the given variable (e.g., book here) is assigned the name of the matching file, so that it can be used in output , error , and other statements. The result is the same as if we had written out a much longer submit file: ... transfer_input_files = AAiW.txt arguments = \"AAiW.txt\" output = AAiW.txt.out error = AAiW.txt.err queue transfer_input_files = PandP.txt arguments = \"PandP.txt\" output = PandP.txt.out error = PandP.txt.err queue transfer_input_files = TAoSH.txt arguments = \"TAoSH.txt\" output = TAoSH.txt.out error = TAoSH.txt.err queue ... How many jobs were created? Is this what you expected? If you ran this in the same directory as Exercise 2.3, you may have noticed that a job was submitted for the books_n.txt file that holds the variable values in the queue from statement. Beware the dangers of matching more files than intended! One solution may be to put all of the books into an books directory and queue matching books/*.txt . Can you think of other solutions? If you have time, try one! Extra Challenge 1 \u00b6 In the example above, you used a single log file for all three jobs. HTCondor handles this situation with no problem; each job writes its events into the log file without getting in the way of other events and other jobs. But as you may have seen, it may be difficult for a person to understand the events for any particular job in the combined log file. Create a new submit file that works just like the one above, except that each job writes its own log file. Extra Challenge 2 \u00b6 Between this exercise and the previous one, you have explored two of the three primary queue statements. How would you use the queue in ... list statement to accomplish the same thing(s) as one or both of the exercises?","title":"Bonus Exercise 2.4 - Use queue matching with a custom variable"},{"location":"materials/htcondor/part2-ex4-queue-matching/#bonus-htc-exercise-24-submit-with-queue-matching","text":"","title":"Bonus HTC Exercise 2.4: Submit With \u201cqueue matching\u201d"},{"location":"materials/htcondor/part2-ex4-queue-matching/#exercise-goal","text":"The goal of this exercise is to submit many jobs from a single submit file by using the queue ... matching syntax to submit jobs with variable values derived from files in the current directory which match a specified pattern.","title":"Exercise Goal"},{"location":"materials/htcondor/part2-ex4-queue-matching/#counting-words-in-files","text":"Returning to our book word-counting example, let's pretend that instead of three books, we have an entire library. 
While we could list all of the text files in a books.txt file and use queue book from books.txt , it could be a tedious process, especially for tens of thousands of files. Luckily HTCondor provides a mechanism for submitting jobs based on pattern-matched files.","title":"Counting Words in Files"},{"location":"materials/htcondor/part2-ex4-queue-matching/#queue-jobs-by-matching-filenames","text":"This is an example of a common scenario: We want to run one job per file, where the filenames match a certain consistent pattern. The queue ... matching statement is made for this scenario. Let\u2019s see this in action. First, here is a new version of the script (note, we removed the 'top n words' restriction): #!/usr/bin/env python3 import os import sys import operator if len ( sys . argv ) != 2 : print ( f 'Usage: { os . path . basename ( sys . argv [ 0 ]) } DATA' ) sys . exit ( 1 ) input_filename = sys . argv [ 1 ] words = {} with open ( input_filename , 'r' ) as my_file : for line in my_file : line_words = line . split () for word in line_words : if word in words : words [ word ] += 1 else : words [ word ] = 1 sorted_words = sorted ( words . items (), key = operator . itemgetter ( 1 )) for word in sorted_words : print ( f ' { word [ 0 ] } { word [ 1 ] : 8d } ' ) To use the script: Create and save this script as wordcount.py . Verify the script by running it on one book manually. Create a new submit file to submit one job (pick a book file and model your submit file off of the one above) Modify the following submit file statements to work for all books: transfer_input_files = $(book) arguments = $(book) output = $(book).out error = $(book).err queue book matching *.txt Note As always, the order of statements in a submit file does not matter, except that the queue statement should be last. Also note that any submit file variable name (here, book , but true for process and all others) may be used in any mixture of upper- and lowercase letters. Submit the jobs. HTCondor uses the queue ... matching statement to look for files in the submit directory that match the given pattern, then queues one job per match. For each job, the given variable (e.g., book here) is assigned the name of the matching file, so that it can be used in output , error , and other statements. The result is the same as if we had written out a much longer submit file: ... transfer_input_files = AAiW.txt arguments = \"AAiW.txt\" output = AAiW.txt.out error = AAiW.txt.err queue transfer_input_files = PandP.txt arguments = \"PandP.txt\" output = PandP.txt.out error = PandP.txt.err queue transfer_input_files = TAoSH.txt arguments = \"TAoSH.txt\" output = TAoSH.txt.out error = TAoSH.txt.err queue ... How many jobs were created? Is this what you expected? If you ran this in the same directory as Exercise 2.3, you may have noticed that a job was submitted for the books_n.txt file that holds the variable values in the queue from statement. Beware the dangers of matching more files than intended! One solution may be to put all of the books into an books directory and queue matching books/*.txt . Can you think of other solutions? If you have time, try one!","title":"Queue Jobs By Matching Filenames"},{"location":"materials/htcondor/part2-ex4-queue-matching/#extra-challenge-1","text":"In the example above, you used a single log file for all three jobs. HTCondor handles this situation with no problem; each job writes its events into the log file without getting in the way of other events and other jobs. 
But as you may have seen, it may be difficult for a person to understand the events for any particular job in the combined log file. Create a new submit file that works just like the one above, except that each job writes its own log file.","title":"Extra Challenge 1"},{"location":"materials/htcondor/part2-ex4-queue-matching/#extra-challenge-2","text":"Between this exercise and the previous one, you have explored two of the three primary queue statements. How would you use the queue in ... list statement to accomplish the same thing(s) as one or both of the exercises?","title":"Extra Challenge 2"},{"location":"materials/osg/part1-ex1-login-scp/","text":"OSG Exercise 1.1: Log In to the OSPool Access Point \u00b6 The main goal of this exercise is to log in to an Open Science Pool Access Point so that you can start submitting jobs into the OSPool. But before doing that, you will first prepare a file on Monday\u2019s Access Point to copy to the OSPool Access Point. Then you will learn how to efficiently copy files between the Access Points. If you have trouble getting ssh access to the OSPool Access Point, ask the instructors right away! Gaining access is critical for all remaining exercises. Part 1: On the PATh Access Point \u00b6 The first few sections below are to be completed on ap1.facility.path-cc.io , the PATh Access Point. This is still the same Access Point you have been using since yesterday. Preparing files for transfer \u00b6 When transferring files between computers, it\u2019s best to limit the number of files as well as their size. Smaller files transfer more quickly and, if your network connection fails, restarting the transfer is less painful than it would be if you were transferring large files. Archiving tools (WinZip, 7zip, Archive Utility, etc.) can compress the size of your files and place them into a single, smaller archive file. The Unix tar command is a one-stop shop for creating, extracting, and viewing the contents of tar archives (called tarballs ). Its usage is as follows: To create a tarball named <archive filename> containing <files and/or folders> , use the following command: $ tar -czvf <archive filename> <files and/or folders> Where <archive filename> should end in .tar.gz and <files and/or folders> can be a list of any number of files and/or folders, separated by spaces. To extract the files from a tarball into the current directory: $ tar -xzvf <archive filename> To list the files within a tarball: $ tar -tzvf <archive filename> Comparing compressed sizes \u00b6 You can adjust the level of compression of tar by prepending your command with GZIP=--<level> , where <level> can be either fast for the least compression, or best for the most compression (the default compression is between best and fast ). While still logged in to ap1.facility.path-cc.io : Create and change into a new folder for this exercise, for example osg-ex11 Use wget to download the following files from our web server: Text file: http://proxy.chtc.wisc.edu/SQUID/osgschool21/random_text Archive: http://proxy.chtc.wisc.edu/SQUID/osgschool21/pdbaa.tar.gz Image: http://proxy.chtc.wisc.edu/SQUID/osgschool21/obligatory_cat.jpg Use tar on each file and use ls -l to compare the sizes of the original file and the compressed version. Which files were compressed the least? Why? Part 2: On the Open Science Pool Access Point \u00b6 For many of the remaining exercises, you will be using an OSPool Access Point, ap40.uw.osg-htc.org , which submits jobs into the OSPool. To log in to the OSPool Access Point, use the same username (and SSH key, if you did that) as on ap1 . If you have any issues logging in to ap40.uw.osg-htc.org , please ask for help right away!
So please ssh in to the server and take a look around: Log in using ssh USERNAME@ap40.uw.osg-htc.org (substitute your own username) Try some Linux and HTCondor commands; for example: Linux commands: hostname , pwd , ls , and so on What is the operating system? uname and (in this case) cat /etc/redhat-release HTCondor commands: condor_version , condor_q , condor_status -total Transferring files \u00b6 In the next exercise, you will submit the same kind of job as in the previous exercise. Wouldn\u2019t it be nice to copy the files instead of starting from scratch? And in general, being able to copy files between servers is helpful, so let\u2019s explore a way to do that. Using secure copy \u00b6 Secure copy ( scp ) is a command based on SSH that lets you securely copy files between two different servers. It takes similar arguments to the Unix cp command but also takes additional information about servers. Its general form is like this: scp