This tutorial describes a report that reads the data in your file and calculates the average age at which males and females were married and the average age of fathers and mothers when their children were born. It reproduces the AppleScript report using the Ruby language. The output is written to a built-in GEDitCOM II report. When you are done with this tutorial, you should be able to create your own custom reports using Ruby by changing the type of data collected and the format of the output report.
This tutorial was written prior to GEDitCOM II, version 2.0, where scripts were converted to extensions. The tutorial still works, but when creating your own reports it is preferable to package the scripts in an extension. The details for packaging scripts in extensions are provided in the GEDitCOM Editor help.
    #!/usr/bin/ruby
    # Generational Ages Report Script
    # 20 JUN 2010, by John A. Nairn
    #
    # This script generates a report of average ages of all spouses
    # when they got married and when their children were born.
    # The report can be for all spouses in the file or just for
    # spouses in the currently selected family records.

    # Prepare to use Apple's Scripting Bridge for Ruby
    require "osx/cocoa"
    include OSX
    OSX.require_framework 'ScriptingBridge'

    # Define the script name in a global variable
    scriptName="Generation Ages"

    ################### Subroutines

    (see below)

    ################### Main Script

    # fetch application object
    gedit = OSX::SBApplication.applicationWithBundleIdentifier_(\
        "com.geditcom.GEDitCOMII")

    # verify document is open and version is acceptable
    if CheckAvailable(gedit,scriptName)==0
        exit
    end

    # reference to the front document
    gdoc=gedit.documents[0]

    # choose all or currently selected family records
    whichOnes = gdoc.userOptionTitle_buttons_message_(\
        "Get report for All or just Selected family records",\
        ["All", "Cancel", "Selected"],nil)
    if whichOnes=="Cancel"
        exit
    end

    # Get a list of the chosen family records
    if whichOnes=="All"
        fams = gdoc.families()
    else
        selRecs = gdoc.selectedRecords()
        fams = []
        selRecs.each do |fam|
            if fam.recordType()=="FAM"
                fams.push(fam)
            end
        end
    end

    # No report if no family records were found
    if fams.length==0
        puts "No family records were selected"
        exit
    end

    # Collect all report data in a subroutine
    CollectAges(gdoc,fams)

    # write to report and then done
    WriteToReport(gdoc,gedit)
Listing 1 shows the entire main script, although crucial components of the script are done in subroutines that are given below. This section describes the logic of the main script.
The script starts with comment lines beginning in "#". It is a good idea to start all scripts with comments. If you share your scripts with other GEDitCOM II users or revisit a script written a while ago, these comments can document use of the script.
The Prepare to use section must start all Ruby scripts. These commands load modules needed to allow Ruby scripts to interact with GEDitCOM II. The scriptName variable holds the name of the script. Any place you need to refer to the script by name, use this variable rather than literal text of the name. This approach will make parts of your script more reusable in other scripts.
The first step is to verify that it makes sense to run this script. All the work is done in the CheckAvailable() subroutine (see utility subroutines). The subroutine returns 1 if it is OK to proceed or 0 to exit. This script, for example, requires a document to be open. This script also requires version 1.5 or newer of GEDitCOM II (because it uses some commands first defined in version 1.5); you can verify version number when the script is packaged into an extension.
You will often want to run reports on your entire file, but it can be helpful to focus a report on a subset of your file. To achieve this goal, many scripts have an option to run on the entire file or on just the selected records. To run a report on a subset of the file, a user selects the records first and then runs the script. The next three sections let the user choose the report target. First, the user option command (userOptionTitle_buttons_message_() in Ruby) displays a box with three buttons for "All", "Cancel", or "Selected" to report on the entire file, to abort the script, or to report on the currently selected records, respectively. The "All" option, which is first, is the default option (the user can hit return to use that option). The command is sent to the desired document using the gdoc reference defined at the beginning of the script.
Once the user decides which records to use, the next section compiles all needed records into a list variable (fams). This report reads ages of fathers and mothers and thus only needs to look at family records. If the user selects "All", the list is found by reading families() from the front document (gdoc). If "Selected" is chosen instead, the script fetches the selectedRecords() of the front document. The list of currently selected records is a standard property of GEDitCOM II documents. This list may have any number of records (including none) and may have any type of record. Because this report only cares about family records, the each loop goes through the list of selected records and adds only the family records to the fams list variable.
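The filtering step can be tried outside GEDitCOM II with stand-in objects. The sketch below is only an illustration: FakeRecord is a hypothetical Struct invented here to mimic the recordType() property, and select is used on a plain Ruby array as a more compact alternative to the each/push loop in Listing 1. Whether the list returned by selectedRecords() supports select in the same way is an assumption that should be checked before relying on it.

```ruby
# Hypothetical stand-in for a GEDitCOM II record; only recordType is mimicked
FakeRecord = Struct.new(:ident, :kind) do
  def recordType
    kind
  end
end

# Keep only the family records from a mixed selection
def family_records(selected)
  selected.select { |rec| rec.recordType == "FAM" }
end

sel_recs = [
  FakeRecord.new("@I1@", "INDI"),
  FakeRecord.new("@F1@", "FAM"),
  FakeRecord.new("@S1@", "SOUR"),
  FakeRecord.new("@F2@", "FAM")
]
fams = family_records(sel_recs)
puts fams.map { |f| f.ident }.join(", ")
```

An empty selection simply yields an empty list, which is the case the length check in the main script is there to catch.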
Finally, once all family records are in the fams list variable, the length of that list is checked (fams.length). If it has no elements, there is no need to proceed and the script exits with a message that "No family records were selected". Otherwise the script continues.
The final section is the main part of the script, but all work is done in two subroutines. First, the CollectAges() subroutine extracts all needed age information from the provided list of family records and stores the results in global variables. Next, the WriteToReport() subroutine formats the report for output to the user and the script is done.
    # Collect data for the generation ages report
    def CollectAges(gdoc,famList)
        # initialize global counters
        $numHusbAge=$sumHusbAge=$numFathAge=$sumFathAge=0
        $numWifeAge=$sumWifeAge=$numMothAge=$sumMothAge=0

        # progress reporting interval
        fractionStepSize=nextFraction=0.01
        numFams=famList.length
        i=0
        famList.each do |fam|
            # read family record information
            husbRef = fam.husband()
            wifeRef = fam.wife()
            chilList = fam.children()
            mdate = fam.marriageSDN()

            # read parent birth dates
            hbdate = wbdate = 0
            if husbRef != ""
                hbdate = husbRef.birthSDN()
            end
            if wifeRef != ""
                wbdate = wifeRef.birthSDN()
            end

            # spouse ages at marriage
            if mdate>0
                if hbdate>0
                    $sumHusbAge = $sumHusbAge + GetAgeSpan(hbdate,mdate)
                    $numHusbAge = $numHusbAge+1
                end
                if wbdate>0
                    $sumWifeAge = $sumWifeAge + GetAgeSpan(wbdate,mdate)
                    $numWifeAge = $numWifeAge+1
                end
            end

            # spouse ages when children were born
            if hbdate > 0 or wbdate > 0
                chilList.each do |chilRef|
                    cbdate = chilRef.birthSDN()
                    if cbdate > 0 and hbdate > 0
                        $sumFathAge = $sumFathAge+GetAgeSpan(hbdate,cbdate)
                        $numFathAge = $numFathAge + 1
                    end
                    if cbdate > 0 and wbdate > 0
                        $sumMothAge = $sumMothAge+GetAgeSpan(wbdate,cbdate)
                        $numMothAge = $numMothAge + 1
                    end
                end
            end

            # time for progress
            i = i+1
            fractionDone = Float(i)/Float(numFams)
            if fractionDone > nextFraction
                gdoc.notifyProgressFraction_message_(fractionDone,nil)
                nextFraction = nextFraction+fractionStepSize
            end
        end
    end
This subroutine (see Listing 2) collects all data on ages from the information in your file. It is where most of the work of this script is done; the work is done by interaction with your data through GEDitCOM II's scripting objects and their properties.
The first section initializes global variables. These variables will be accessed elsewhere in the script to format the report, which is why they need to be global variables. The variables fractionStepSize, nextFraction, numFams, and i are local variables used for tracking progress of the script and are discussed more below.
The each loop is over all family records passed to this subroutine. The loop starts by reading data from the family record: references to the husband and wife records (in husbRef and wifeRef), a list of all children records (in chilList), and the marriage date (in mdate). The marriage date, like all dates in this script, is read as a serial day number (using built-in SDN properties), which is a day number starting with 1 back around 4000 B.C. Serial day numbers are ideal for date calculations such as finding years between dates. These SDN properties return the serial day number for a date, or return 0 if the date is either not known or is invalid in the file.
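To see why day numbers simplify date arithmetic, the following sketch uses the Julian day number from Ruby's standard Date class (Date#jd) as a stand-in for GEDitCOM II's SDN properties. This is an illustration under assumptions: the two numbering schemes may not share the same starting day, years_between is a hypothetical helper and not the tutorial's GetAgeSpan(), and dividing by 365.25 days per year is an approximation chosen here.

```ruby
require 'date'

# Hypothetical helper: approximate years between two day numbers.
# Returns nil when either day number is 0 (unknown or invalid date).
def years_between(begin_day, end_day)
  return nil if begin_day <= 0 || end_day <= 0
  (end_day - begin_day) / 365.25
end

birth = Date.new(1850, 3, 1).jd      # day number of the birth date
marriage = Date.new(1875, 3, 1).jd   # day number of the marriage date

age = years_between(birth, marriage)
puts "Age at marriage: %.2f" % age
puts years_between(0, marriage).inspect
```

Because both dates are plain integers, the age is a single subtraction and division, with no month or leap-year bookkeeping in the script itself.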
The next section reads the parents' birth dates. From above, husbRef and wifeRef are references to the parents in this family, or either could be an empty string, meaning the record does not have that spouse. For each spouse that is in the family record, this section reads their birth serial day number using properties of their individual records; otherwise the dates will be zero.
The next two sections do the date calculations for this script. First are the calculations for ages of each parent at the time of marriage. This calculation can only be done if both a spouse's birth date and the family's marriage date are known. Thus if both serial day numbers are greater than zero, the age is calculated (using a utility method called GetAgeSpan()). The global variables $numHusbAge and $numWifeAge count the number of age calculations done. The $sumHusbAge and $sumWifeAge variables hold a sum of all ages. When this subroutine is done, the sum variable divided by the num variable will be the average age.
The age at child birth section is similar. It contains a loop over all children in the family. For each child, it looks for their birth date. If a birth date is found, the ages of each parent with a known birth date are added to global variables analogous to the num and sum variables in the previous section. This entire section is enclosed in a conditional that says to do these calculations only if at least one parent birth date is known.
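The accumulate-then-divide pattern used in these two sections can be sketched in isolation. The example below is a simplified illustration, not the tutorial's code: it uses plain hashes in place of family records, a local hash in place of global variables, made-up day numbers (0 meaning unknown), and a direct division by 365.25 in place of GetAgeSpan().

```ruby
# Made-up sample data: day numbers, with 0 meaning "unknown"
families = [
  { :father => 1000, :mother => 1400, :children => [10000, 11000] },
  { :father => 0,    :mother => 2000, :children => [12000, 0] }
]

totals = { :num_fath => 0, :sum_fath => 0.0, :num_moth => 0, :sum_moth => 0.0 }

families.each do |fam|
  # skip the family if neither parent has a known birth date
  next unless fam[:father] > 0 || fam[:mother] > 0
  fam[:children].each do |child|
    next unless child > 0            # child birth date unknown
    if fam[:father] > 0
      totals[:sum_fath] += (child - fam[:father]) / 365.25
      totals[:num_fath] += 1
    end
    if fam[:mother] > 0
      totals[:sum_moth] += (child - fam[:mother]) / 365.25
      totals[:num_moth] += 1
    end
  end
end

# average = sum / count, guarded against a zero count
avg_fath = totals[:num_fath] > 0 ? totals[:sum_fath] / totals[:num_fath] : nil
avg_moth = totals[:num_moth] > 0 ? totals[:sum_moth] / totals[:num_moth] : nil
puts "fathers: #{totals[:num_fath]} ages, mothers: #{totals[:num_moth]} ages"
```

Keeping separate counts for fathers and mothers matters because a child can contribute to one average without contributing to the other.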
The last section of the loop informs the user of the script progress using the notifyProgressFraction_message_() command.
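The progress code in Listing 2 reports only when the fraction done passes the next threshold, so with a step size of 0.01 the command is sent roughly 100 times no matter how many families are processed. A minimal sketch of that throttling pattern follows; each_with_progress is a hypothetical helper, and a lambda stands in for the GEDitCOM II progress command so the code can run on its own.

```ruby
# Hypothetical helper: iterate over items and pass a progress fraction
# to the reporter only when it passes the next threshold.
def each_with_progress(items, step_size, reporter)
  next_fraction = step_size
  total = items.length
  items.each_with_index do |item, index|
    yield item
    fraction_done = Float(index + 1) / Float(total)
    if fraction_done > next_fraction
      reporter.call(fraction_done)   # stand-in for the GEDitCOM II progress command
      next_fraction += step_size
    end
  end
end

reports = []
processed = 0
each_with_progress((1..1000).to_a, 0.1, lambda { |f| reports.push(f) }) do |item|
  processed += 1
end
puts "processed #{processed} items with #{reports.length} progress updates"
```

The design choice is to keep the cost of progress reporting roughly constant rather than proportional to the size of the file.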
When the each loop is done, the global variables (e.g., $numHusbAge, $sumHusbAge, etc.) will contain all data needed to output the report. The subroutine ends and returns control to the main script. The next section explains formatting of the output report.
    # Write the results now in the global variables to a
    # GEDitCOM II report
    def WriteToReport(gdoc,gedit)
        # build report using <html> elements beginning with <div>
        rpt = ["<div>\n"]

        # begin report with <h1> for title
        fname = gdoc.name()
        rpt.push("<h1>Generational Age Analysis in " + fname + "</h1>\n")

        # start <table> and give it a <caption>
        rpt.push("<table>\n<caption>\n")
        rpt.push("Summary of spouse ages when married and when children were born\n")
        rpt.push("</caption>\n")

        # column labels in the <thead> section
        rpt.push("<thead><tr>\n")
        rpt.push("<th>Age Item</th><th>Husband</th><th>Wife</th>\n")
        rpt.push("</tr></thead>\n")

        # the rows are in the <tbody> element
        rpt.push("<tbody>\n")

        # rows for ages when married and when children were born
        rpt.push(InsertRow("Avg. Age at Marriage", $numHusbAge,\
            $sumHusbAge, $numWifeAge, $sumWifeAge))
        rpt.push(InsertRow("Avg. Age at Childbirth", $numFathAge,\
            $sumFathAge, $numMothAge, $sumMothAge))

        # end the <tbody> and <table> elements
        rpt.push("</tbody>\n</table>\n")
        rpt.push("</div>")

        # display the report
        theReport = rpt.join
        p = {"name"=>"Generational Ages","body"=>theReport}
        newReport = gedit.classForScriptingClass_("report").\
            alloc().initWithProperties_(p)
        gdoc.reports.addObject_(newReport)
        newReport.showBrowser()
    end
Formatting a report for output in GEDitCOM II means formatting the data using html elements, all enclosed within a single div element. You can use any html elements you want. Here the report title is put in an h1 element and all results are placed in a table element. The subroutine to create this report is in Listing 3.
The report is stored in the rpt list variable. The script starts by creating a single-element list variable with the <div> element (and a newline character). Each new piece of text needed for the report is added as another element at the end of the list using the push() method. When done, the list is converted to a string variable with the command rpt.join, which combines all elements one after another. An alternative method is to use a string variable. These two approaches, side by side, are:
    rpt = ["text 1"]            rpt = "text 1"
    rpt.push("text 2")          rpt = rpt + "text 2"
    ...                         ...
    theReport = rpt.join
The list version on the left is faster because adding an element to the end of a list is faster than combining a string with itself (e.g., rpt = rpt + "text 2") many times. For this small script the difference would not be noticeable, but it is good practice to use the most efficient methods whenever possible.
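The two approaches can be compared directly in a short, self-contained sketch. The sample pieces are made up for illustration, and a third variant using Ruby's in-place append operator (<<) is included as an additional option that is not part of the original script; unlike +, it modifies the existing string rather than building a new one on each step.

```ruby
pieces = ["<div>\n", "<h1>Title</h1>\n", "</div>"]

# Approach 1: collect pieces in an array, join once at the end
rpt_list = []
pieces.each { |text| rpt_list.push(text) }
from_list = rpt_list.join

# Approach 2: build a new string with + on every step
from_plus = ""
pieces.each { |text| from_plus = from_plus + text }

# Approach 3 (not in the original script): append in place with <<
from_append = "".dup
pieces.each { |text| from_append << text }

puts from_list == from_plus && from_plus == from_append
```

All three produce the same report text, so the choice between them is about efficiency and readability rather than output.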
The process is straightforward, assuming you understand html elements. A name for the report is put into an h1 element; the name includes the file name. All data are in a three-column table where the first column labels the data and the other two columns give results for husbands and wives. The table starts with a caption for the table. The thead section has a header row to label the three columns. The tbody has two rows to report results for average ages at marriage and average ages when children were born. These rows are formatted using a custom InsertRow() subroutine. Finally, all elements are closed and the report ends with a closing </div> tag.
The final step is to send the report to a GEDitCOM II report and display the report to the user. The report is created with an initWithProperties_(p) command, and properties are used to name the report and set the report text to the contents of theReport, which is created by joining all string elements in rpt using rpt.join. Finally, the report is displayed to the user with the showBrowser() command.
    # Insert table row with husband and wife results
    def InsertRow(rowLabel, numHusb, sumHusb, numWife, sumWife)
        tr = "<tr><td>" + rowLabel + "</td><td align='"
        if numHusb > 0
            tr = tr + "right'>%.2f" % (sumHusb / numHusb)
        else
            tr = tr + "center'>-"
        end
        tr = tr + "</td><td align='"
        if numWife > 0
            tr = tr + "right'>%.2f" % (sumWife / numWife)
        else
            tr = tr + "center'>-"
        end
        tr = tr + "</td></tr>\n"
        return tr
    end
This subroutine formats each row of the table. The input parameters are a label for the row and numerical results to be averaged and displayed in the table. The only catch is that numHusb or numWife might be zero if no individuals suitable for averaging were found in the CollectAges() subroutine. Since we do not want to divide by zero, this special case is trapped and the table cell is loaded with "-" rather than a calculated average. Average ages are displayed with two digits after the decimal point by using the format operator (%) of the String class with "%.2f".
Another refinement implemented in this subroutine is to select alignment for the table cells. The label is left justified. All averages are right justified. If no data are available, the "-" is centered. When the subroutine ends, it returns the entire text for the row.
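One detail worth checking when adapting this subroutine: in Ruby, dividing one Integer by another discards the fraction, so the average is only correct if the sum is a Float. Whether GetAgeSpan() always returns a Float is not shown in this section, so the sketch below converts explicitly as a precaution. The format_average helper is hypothetical and only illustrates the zero-count guard and the two-decimal formatting.

```ruby
# Hypothetical helper: format an average with two decimals, or "-" when
# there is nothing to average. to_f guards against integer division.
def format_average(sum, num)
  return "-" if num <= 0
  "%.2f" % (sum.to_f / num)
end

puts format_average(77, 3)      # integer inputs still give a fractional average
puts format_average(51.5, 2)
puts format_average(0, 0)
```

With the conversion in place, the helper gives the same result whether the accumulated sum happens to be an Integer or a Float.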